Claim(s) searched completely:
1-24,26-27 



Claim(s) not searched:
25 



Reason for the limitation of the search (non-patentable invention(s)) : 
Article 52(2)(d) EPC - Presentation of information (claim 25)



Further limitation of the search 

Claim(s) searched completely: 
26,27 

Claim(s) searched incompletely: 
1-5,8,10-13,15-24 

Claim(s) not searched: 
6,7,9,14 

Reason for the limitation of the search: 

The present set of claims is unclear (Art. 84 EPC), the reasons being as 
follows: 

Claims 1 and 2 relate to the detection of at least two neoplasia markers 
which are defined such that they include a huge number of possible 
markers and combinations thereof. Claims 4-7 relate to markers on 
specific chromosomal regions, which still give rise to many possible 
combinations: the 17q12-q24 region alone appears to encompass about 50
known genes (cf Table 1), many VNTRs (cf claim 9), SNPs (cf claim 10) and
possibly other markers. 

The markers of claim 1 are defined only such that they are located on 
"one chromosomal region which is altered in malignant neoplasia", which 
renders it impossible for the skilled reader to determine what markers 
fall under the scope of said claim: the definition of "altered 
chromosomal region" given in the description (p. 17, l. 26-31) does not
clearly define how a region is delimited, and the limits of such regions 
appear to vary according to cell type (cf Fig. 4). The same applies also 
to independent claims 2 (part a) and 8. Claim 2, part b, further refers 
to five conditions which the two or more markers should fulfil, yet these 
are not clearly delimited, so that it is impossible for a meaningful 
search to be conducted. 

The claims as a whole are neither clear nor concise. Claim 1 requires 
that the two or more markers are located on one altered chromosomal 
region, whereas in claim 2 the markers may be on more than one region, 
but must interact in some way. Claim 8 requires neither colocalisation
nor interaction. Claims 11 and 12 appear to relate simply to detection of
one or more marker genes located on the 17q12-q24 locus. Thus, it is not
possible to determine the essential features of the present application. 
Thus, for the reasons given above, the claims are unclear and not concise
(Art. 84 EPC), to the extent that no meaningful search could be carried
out across the entire scope of the claims. The search was restricted
initially to the 43 human genes (cf Tables 2 and 3) which are
co-amplified from chromosomal region 17q12-q24 in neoplastic lesions from
breast cancer tissue (cf p. 3, l. 7-14). Due to the enormous number of
possible combinations of two or more of said marker genes, the search has
been further restricted to the groups of genes which have been clearly
identified as interacting (cf p. 21, l. 1-4), namely HER-2/neu, GRB7, CrkRS
and CDC6 (SEQ ID NOs 10, 11, 5, 18; being part of identical pathways);
HER-2/neu, THRA and RARA (SEQ ID NOs 10, 15, 19; influencing the
expression of each other); PPARGBP, THRA, RARA, NR1D1 (SEQ ID NOs 4, 15,
19, 16; interacting with each other); HER-2/neu and GRB7 (SEQ ID NOs 10,
11; interacting with each other) (see also p. 60, l. 18 - p. 62, l. 5).
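
To give a sense of scale for the "enormous number of possible combinations" of two or more of the 43 marker genes, the following Python sketch counts them. It is illustrative arithmetic only; the function name is hypothetical and does not appear in the application or in the search report.

```python
from math import comb


def count_marker_combinations(n_genes: int, min_markers: int = 2) -> int:
    """Count the subsets containing at least `min_markers` of `n_genes` genes."""
    return sum(comb(n_genes, k) for k in range(min_markers, n_genes + 1))


# All subsets of 43 genes, minus the empty set and the 43 single-gene sets.
total_combinations = count_marker_combinations(43)

# Even when restricted to pairs, the number of combinations is large.
pair_combinations = comb(43, 2)

print(total_combinations, pair_combinations)
```

The total equals 2^43 - 44, which illustrates why a search restricted to explicitly identified interacting groups is far more tractable.
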
Further limitations to the search due to lack of clarity (Art. 84 EPC) are 
as follows: 

Claim 11, part b, relates to a polynucleotide encoding a polypeptide 
exhibiting "the same biological function" as the sequences in Table 2 or 
3. This definition does not enable the claimed polynucleotide to be 
identified, so that part b has not been searched. Part c is clear only 
insofar as it relates to a degenerate polynucleotide of part a encoding 
the same polypeptide, and has only been searched in this respect. Part d 
relates to fragments and derivatives or variations of the foregoing 
polynucleotide sequences but is only clear and has been searched only 
insofar as it refers to the sequences in part a. The polypeptides of part 
e are only clear and have been searched only in that they relate to the 
specific sequences encoded by the polynucleotides given in part a, i.e. 
the polypeptide sequences listed in part f. The same applies to 
independent claims 12, 15-20 and 22-24. 

Additionally, claim 15 is defined only in terms of the polynucleotides or 
polypeptides which are to be detected, and not in terms of the features 
of the detection agents themselves. Therefore, said claim has been 
searched only insofar as the claim is clear, i.e. wherein the detection 
agents are antibodies or polynucleotide probes (cf description p. 9,
l. 10-16).

Additionally, claim 20 refers (step h) to an antibody "capable of binding
to" the specified target; this is clear, and has been searched, only insofar as
it relates to an antibody specific for the specified target. Said claim
also refers (step i) to a reagent identified by screening methods of
claims 17-19, yet provides no features of such reagents per se;
therefore, this feature is unclear and has been searched only insofar as
such reagents have been identified, namely antibodies and antisense
oligonucleotides (cf p. 140, l. 9-15). This same objection applies equally
to claims 22 and 23, directed to reagents identified by the methods of 
claims 17-19, as well as to claim 24 insofar as it refers to the reagents 
of claims 22 and 23. 

Claim 14 is directed to a diagnostic kit, the only feature of which is 
instructions for carrying out a method. Said feature is not a limiting 
technical feature. In the absence of any indication of the specific 
composition of the kit, no meaningful search can be carried out for claim 
14. 
Description

TECHNICAL FIELD OF THE INVENTION 

[0001] The invention relates to methods and compositions for the prediction, diagnosis, prognosis, prevention and
treatment of neoplastic disease. Neoplastic disease is often caused by chromosomal rearrangements which lead to
over- or underexpression of the rearranged genes. The invention discloses genes which are overexpressed in
neoplastic tissue and are useful as diagnostic markers and targets for treatment. Methods are disclosed for predicting,
diagnosing and prognosing as well as preventing and treating neoplastic disease.


BACKGROUND OF THE INVENTION 

[0002] Chromosomal aberrations (amplifications, deletions, inversions, insertions, translocations and/or viral
integrations) are of importance for the development of cancer and neoplastic lesions, as they account for deregulations
of the respective regions. Amplifications of genomic regions have been described in which genes of importance for
growth characteristics, differentiation, invasiveness or resistance to therapeutic intervention are located. One of those
regions with chromosomal aberrations is the region carrying the HER-2/neu gene, which is amplified in breast cancer
patients. In approximately 25% of breast cancer patients the HER-2/neu gene is overexpressed due to gene
amplification. HER-2/neu overexpression correlates with a poor prognosis (relapse, overall survival, sensitivity to
therapeutics). The importance of HER-2/neu for the prognosis of the disease progression has been described [Gusterson et
al., 1992, (1)]. Gene-specific antibodies raised against HER-2/neu (Herceptin™) have been generated to treat the
respective cancer patients. However, only about 50% of the patients benefit from the antibody treatment with
Herceptin™, which is most often combined with a chemotherapeutic regimen. The discrepancy among HER-2/neu positive tumors
(overexpressing HER-2/neu to a similar extent) with regard to responsiveness to therapeutic intervention suggests that
there might be additional factors or genes involved in growth and apoptotic characteristics of the respective
tumor tissues. There seems to be no monocausal relationship between overexpression of the growth factor receptor
HER-2/neu and therapy outcome. In line with this, the measurement of commonly used tumor markers such as estrogen
receptor, progesterone receptor, p53 and Ki-67 provides only very limited information on the clinical outcome of specific
therapeutic decisions. Therefore there is a great need for a more detailed diagnostic and prognostic classification of
tumors to enable improved therapy decisions and prediction of survival of the patients. The present invention addresses
the need for additional markers by providing genes whose expression is deregulated in tumors and correlates with
clinical outcome. One focus is the deregulation of genes present in specific chromosomal regions and their interaction
in disease development and drug responsiveness.

[0003] HER-2/neu and other markers for neoplastic disease are commonly assayed with diagnostic methods such
as immunohistochemistry (IHC) (e.g. HercepTest™ from DAKO Inc.) and Fluorescence-In-Situ-Hybridization (FISH)
(e.g. quantitative measurement of HER-2/neu and Topoisomerase II alpha with a Fluorescence-In-Situ-Hybridization
kit from VYSIS). Additionally, HER-2/neu can be assayed by detecting HER-2/neu fragments in serum with an ELISA
test (BAYER Corp.) or with a quantitative PCR kit which compares the amount of HER-2/neu gene with the amount
of a non-amplified control gene in order to detect HER-2/neu gene amplifications (ROCHE). These methods, however,
exhibit multiple disadvantages with regard to sensitivity, specificity, technical and personnel efforts, costs, time
consumption and inter-lab reproducibility. These methods are also restricted with regard to measurement of multiple
parameters within one patient sample ("multiplexing"). Usually only about 3 to 4 parameters (e.g. genes or gene products)
can be detected per tissue slide. Therefore, there is a need to develop a fast and simple test to measure simultaneously
multiple parameters in one sample. The present invention addresses the need for a fast and simple high-resolution
method that is able to detect multiple diagnostic and prognostic markers simultaneously.

SUMMARY OF THE INVENTION 

[0004] The present invention is based on the discovery that chromosomal alterations in cancer tissues can lead to
changes in the expression of genes that are encoded by the altered chromosomal regions. By way of example, 43 human genes have
been identified that are co-amplified in neoplastic lesions from breast cancer tissue, resulting in altered expression of
several of these genes (Tables 1 to 4). These 43 genes are differentially expressed in breast cancer states, relative to
their expression in normal, or non-breast cancer states. The present invention relates to derivatives, fragments,
analogues and homologues of these genes and uses or methods of using the same.
[0005] The present invention further relates to novel preventive, predictive, diagnostic, prognostic and therapeutic
compositions and uses for malignant neoplasia and breast cancer in particular. Membrane-bound marker
gene products containing extracellular domains can be particularly useful targets for treatment methods as well as
diagnostic and clinical monitoring methods.
[0006] It is a discovery of the present invention that several of these genes are characterized in that their gene
products functionally interact in signaling cascades or by directly or indirectly influencing each other. This interaction
is important for the normal physiology of certain non-neoplastic tissues (e.g. brain or neurogenic tissue). The
deregulation of these genes in neoplastic lesions, where they normally exhibit a different level of activity or are not active,
however, results in pathophysiology and affects the characteristics of the disease-associated tissue.

[0007] The present invention further relates to methods for detecting these deregulations in malignant neoplasia at the
DNA and mRNA level.

[0008] The present invention further relates to a method for the detection of chromosomal alterations characterized
in that the relative abundance of individual mRNAs encoded by genes located in altered chromosomal regions is
detected.

[0009] The present invention further relates to a method for the detection of the flanking breakpoints of named
chromosomal alterations by measurement of DNA copy number by quantitative PCR or DNA-Arrays and DNA sequencing.
[0010] Also disclosed is a method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection of
DNA sequences that flank the named genomic breakpoints or are located within them.

[0011] The present invention further relates to a method for the detection of chromosomal alterations characterized
in that the copy number of one or more genomic nucleic acid sequences located within an altered chromosomal region(s)
is detected by quantitative PCR techniques (e.g. TaqMan™, Lightcycler™ and iCycler™).
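
As an illustration of how a copy number might be estimated from quantitative PCR data, the following Python sketch applies the commonly used comparative threshold-cycle (delta-delta Ct) calculation. This is a sketch under stated assumptions and is not taken from the present description: it assumes a non-amplified control gene measured in the same sample, a non-neoplastic reference sample, and an amplification efficiency of about 2 (doubling per cycle). The function and parameter names are hypothetical.

```python
def relative_copy_number(ct_target_tumor: float, ct_control_tumor: float,
                         ct_target_normal: float, ct_control_normal: float,
                         efficiency: float = 2.0) -> float:
    """Estimate the copy number of a target sequence in a tumor sample
    relative to a normal sample (comparative Ct method).

    Assumption: both amplicons amplify with the same `efficiency`
    (2.0 means perfect doubling in every cycle).
    """
    # Normalize the target to the non-amplified control gene in each sample.
    delta_tumor = ct_target_tumor - ct_control_tumor
    delta_normal = ct_target_normal - ct_control_normal
    # A lower Ct means more starting template, hence the negative exponent.
    return efficiency ** -(delta_tumor - delta_normal)


# Hypothetical Ct values: the target appears 3 cycles earlier in the tumor.
print(relative_copy_number(22.0, 25.0, 25.0, 25.0))
```

A value near 1 would suggest no copy number change under these assumptions, while values well above 1 would be consistent with amplification of the target region.
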
[0012] The present invention further relates to a method for the prediction, diagnosis or prognosis of malignant
neoplasia by the detection of at least 2 markers whereby the markers are genes and fragments thereof or genomic nucleic
acid sequences that are located on one chromosomal region which is altered in malignant neoplasia and breast cancer
in particular.

[0013] The present invention also discloses a method for the prediction, diagnosis or prognosis of malignant
neoplasia by the detection of at least 2 markers whereby the markers are located on one or more chromosomal region(s)
which is/are altered in malignant neoplasia; and the markers interact as (i) receptor and ligand or (ii) members of the
same signal transduction pathway or (iii) members of synergistic signal transduction pathways or (iv) members of
antagonistic signal transduction pathways or (v) transcription factor and transcription factor binding site.
[0014] Also disclosed is a method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection
of at least one marker whereby the marker is a VNTR, SNP, RFLP or STS which is located on one chromosomal region
which is altered in malignant neoplasia due to amplification and the marker is detected in (a) a cancerous and (b) a
non-cancerous tissue or biological sample from the same individual. A preferred embodiment is the detection of at
least one VNTR marker of Table 6 or at least one SNP marker of Table 4 or combinations thereof. Even more preferably,
the detection, quantification and sizing of such polymorphic markers can be achieved by methods (a) for the
comparative measurement of amount and size by PCR amplification and subsequent capillary electrophoresis, (b) for
sequence determination and allelic discrimination by gel electrophoresis (e.g. SSCP, DGGE), real-time kinetic PCR,
direct DNA sequencing, pyro-sequencing, mass-specific allelic discrimination or resequencing by DNA array
technologies, (c) for the determination of specific restriction patterns and subsequent electrophoretic separation and (d) for
allelic discrimination by allele-specific PCR (e.g. ASO). Even more favorably, the detection of a heterozygous VNTR, SNP,
RFLP or STS is done in a multiplex fashion, utilizing a variety of labeled primers (e.g. fluorescent, radioactive, bioactive)
and a suitable capillary electrophoresis (CE) detection system.
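
The comparison of a heterozygous polymorphic marker between cancerous and non-cancerous samples from the same individual can be sketched as an allelic imbalance ratio computed from capillary electrophoresis peak heights. The following Python example is a minimal sketch, not the method of the present description: the ratio formula is the conventional one for paired samples, and the threshold of 1.5 as well as all names are assumptions chosen for illustration.

```python
def allelic_imbalance_ratio(tumor_peaks: tuple[float, float],
                            normal_peaks: tuple[float, float]) -> float:
    """Ratio of the allele peak ratio in tumor to that in matched normal tissue.

    Each argument holds the peak heights (allele 1, allele 2) of one
    heterozygous marker. Values far from 1 suggest that one allele has been
    gained (e.g. by amplification) or lost in the tumor sample.
    """
    t1, t2 = tumor_peaks
    n1, n2 = normal_peaks
    if min(t1, t2, n1, n2) <= 0:
        raise ValueError("all peak heights must be positive")
    return (t1 / t2) / (n1 / n2)


def shows_imbalance(ratio: float, threshold: float = 1.5) -> bool:
    """Flag an imbalance in either direction; the threshold is an assumption."""
    return ratio >= threshold or ratio <= 1.0 / threshold


# Hypothetical peak heights: allele 1 is over-represented in the tumor sample.
ratio = allelic_imbalance_ratio((3000.0, 1000.0), (1000.0, 1000.0))
print(ratio, shows_imbalance(ratio))
```

Using the matched non-cancerous sample as the denominator corrects for allele-specific differences in PCR amplification, which is one reason both samples are taken from the same individual.
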

[0015] In another embodiment the expression of these genes can be detected with DNA-arrays as described in
WO9727317 and US6379895.

[0016] In a further embodiment the expression of these genes can be detected with bead-based direct fluorescent
readout techniques such as described in WO9714028 and WO9952708.

[0017] In one embodiment, the invention pertains to a method of determining the phenotype of a cell or tissue,
comprising detecting the differential expression, relative to a normal or untreated cell, of at least one polynucleotide
comprising SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19 or 21 to 26 or 53 to 75, wherein the polynucleotide is differentially
expressed by at least about 1.5 fold, at least about 2 fold or at least about 3 fold.

[0018] In a further aspect the invention pertains to a method of determining the phenotype of a cell or tissue,
comprising detecting the differential expression, relative to a normal or untreated cell, of at least one polynucleotide which
hybridizes under stringent conditions to one of the polynucleotides of SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19 or 21
to 26 or 53 to 75 and encodes a polypeptide exhibiting the same biological function as given in Table 2 or 3 for the
respective polynucleotide, wherein the polynucleotide is differentially expressed by at least about 1.5 fold, at
least about 2 fold or at least about 3 fold.
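
The fold thresholds named above can be illustrated with a short Python sketch that reports the highest threshold met by a measured expression level. It is a sketch only: treating over- and underexpression symmetrically is an assumption, and the function names and example values are hypothetical.

```python
from typing import Optional


def fold_change(sample_level: float, reference_level: float) -> float:
    """Magnitude of the expression difference, always expressed as >= 1.

    Assumption: a 2-fold decrease counts the same as a 2-fold increase.
    """
    if sample_level <= 0 or reference_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = sample_level / reference_level
    return ratio if ratio >= 1.0 else 1.0 / ratio


def highest_threshold_met(sample_level: float, reference_level: float,
                          thresholds=(1.5, 2.0, 3.0)) -> Optional[float]:
    """Return the largest fold threshold reached, or None if none is reached."""
    change = fold_change(sample_level, reference_level)
    met = [t for t in thresholds if change >= t]
    return max(met) if met else None


# Hypothetical expression levels relative to a normal or untreated cell.
print(highest_threshold_met(250.0, 100.0))  # 2.5-fold higher than reference
print(highest_threshold_met(100.0, 400.0))  # 4-fold lower than reference
print(highest_threshold_met(120.0, 100.0))  # below every threshold
```
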

[0019] In another embodiment of the invention a polynucleotide comprising a polynucleotide selected from SEQ ID
NO: 2 to 6, 8, 9, 11 to 16, 18, 19 or 21 to 26 and 53 to 75 or encoding one of the polypeptides with SEQ ID NO: 28 to
32, 34, 35, 37 to 42, 44, 45 or 47 to 52 or 76 to 98 can be used to identify cells or tissue in individuals which exhibit a
phenotype predisposed to breast cancer or a diseased phenotype, thereby (a) predicting whether an individual is at
risk for the development of, or (b) diagnosing whether an individual has, or (c) prognosing the progression or the
outcome of the treatment of malignant neoplasia and breast cancer in particular.

EP 1 365 034 A2

[0020] In yet another embodiment the invention provides a method for identifying genomic regions which are altered 
on the chromosomal level and encode genes that are linked by function and are differentially expressed in malignant 
neoplasia and breast cancer in particular. 
[0021] In yet another embodiment the invention provides the genomic regions 17q12, 3p21 and 12q13 for use in
prediction, diagnosis and prognosis as well as prevention and treatment of malignant neoplasia and breast cancer. In 
particular not only the intragenic regions, but also intergenic regions, pseudogenes or non-transcribed genes of said 
chromosomal regions can be used for diagnostic, predictive, prognostic and preventive and therapeutic compositions 
and methods. 

[0022] In yet another embodiment the invention provides methods of screening for agents which regulate the activity
of a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a
polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. A test compound is contacted
with a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a
polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. Binding of the test compound
to the polypeptide is detected. A test compound which binds to the polypeptide is thereby identified as a potential
therapeutic agent for the treatment of malignant neoplasia and more particularly breast cancer.
[0023] In yet another embodiment the invention provides another method of screening for agents which regulate
the activity of a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by
a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. A test compound is
contacted with a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded
by a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. A biological activity
mediated by the polypeptide is detected. A test compound which decreases the biological activity is thereby identified
as a potential therapeutic agent for decreasing the activity of the polypeptide comprising a
polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide comprising a polynucleotide
selected from SEQ ID NO: 1 to 26 and 53 to 75 in malignant neoplasia and breast cancer in particular. A test compound
which increases the biological activity is thereby identified as a potential therapeutic agent for increasing the activity
of the polypeptide selected from one of the polypeptides with SEQ ID NO: 27 to 52 and 76
to 98 or encoded by a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 in
malignant neoplasia and breast cancer in particular.

[0024] In another embodiment the invention provides a method of screening for agents which regulate the activity
of a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. A test compound is
contacted with a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. Binding
of the test compound to the polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to
75 is detected. A test compound which binds to the polynucleotide is thereby identified as a potential therapeutic agent
for regulating the activity of a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to
75 in malignant neoplasia and breast cancer in particular.

[0025] The invention thus provides polypeptides selected from one of the polypeptides with SEQ ID NO: 27 to 52
and 76 to 98 or encoded by a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53
to 75 which can be used to identify compounds which may act, for example, as regulators or modulators such as
agonists and antagonists, partial agonists, inverse agonists, activators, co-activators and inhibitors of the polypeptide
comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide comprising
a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. Accordingly, the invention provides reagents and
methods for regulating a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or
encoded by a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 in malignant
neoplasia and more particularly breast cancer. The regulation can be an up- or down-regulation. Reagents that modulate
the expression, stability or amount of a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26
and 53 to 75 or the activity of the polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to
98 or encoded by a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 can
be a protein, a peptide, a peptidomimetic, a nucleic acid, a nucleic acid analogue (e.g. peptide nucleic acid, locked
nucleic acid) or a small molecule. Methods that modulate the expression, stability or amount of a polynucleotide
comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or the activity of the polypeptide comprising
a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide comprising a
polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 can be gene replacement therapies, antisense, ribozyme and
triplex nucleic acid approaches.

[0026] One embodiment of the invention provides antibodies which specifically bind to a full-length or partial
polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide
comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or a polynucleotide comprising a
polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 for use in prediction, prevention, diagnosis, prognosis and
treatment of malignant neoplasia and breast cancer in particular.

[0027] Yet another embodiment of the invention is the use of a reagent which specifically binds to a polynucleotide
comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or a polypeptide comprising a polypeptide
selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide comprising a polynucleotide selected
from SEQ ID NO: 1 to 26 and 53 to 75 in the preparation of a medicament for the treatment of malignant neoplasia
and breast cancer in particular.

[0028] Still another embodiment is the use of a reagent that modulates the activity or stability of a polypeptide
comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide comprising a
polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or the expression, amount or stability of a polynucleotide
comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 in the preparation of a medicament for
the treatment of malignant neoplasia and breast cancer in particular.

[0029] Still another embodiment of the invention is a pharmaceutical composition which includes a reagent which
specifically binds to a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or
a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide
comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75, and a pharmaceutically acceptable carrier.
[0030] Yet another embodiment of the invention is a pharmaceutical composition including a polynucleotide
comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or encoding a polypeptide comprising a
polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98.

[0031] In one embodiment, a reagent which alters the level of expression in a cell of a polynucleotide comprising a
polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or encoding a polypeptide comprising a polypeptide
selected from SEQ ID NO: 27 to 52 and 76 to 98, or a sequence complementary thereto, is identified by providing a
cell, treating the cell with a test reagent, determining the level of expression in the cell of a polynucleotide comprising
a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or encoding a polypeptide comprising a polypeptide
selected from SEQ ID NO: 27 to 52 and 76 to 98 or a sequence complementary thereto, and comparing the level of
expression of the polynucleotide in the treated cell with the level of expression of the polynucleotide in an untreated
cell, wherein a change in the level of expression of the polynucleotide in the treated cell relative to the level of expression
of the polynucleotide in the untreated cell is indicative of an agent which alters the level of expression of the
polynucleotide in a cell.
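
The comparison of treated and untreated cells described above can be sketched in Python as a simple screen over several test reagents. This is an illustrative sketch only: the description requires merely "a change" in expression, so the minimum fold change of 1.5 used here is an assumption, as are the function name, the reagent labels and the example values.

```python
def screen_reagents(untreated_level: float,
                    treated_levels: dict[str, float],
                    min_fold: float = 1.5) -> dict[str, str]:
    """Identify test reagents that alter expression relative to untreated cells.

    `treated_levels` maps a reagent label to the expression level measured in
    cells treated with that reagent. `min_fold` is an assumed cut-off that
    separates a real change from measurement noise.
    """
    if untreated_level <= 0:
        raise ValueError("the untreated expression level must be positive")
    hits = {}
    for reagent, level in treated_levels.items():
        ratio = level / untreated_level
        if ratio >= min_fold:
            hits[reagent] = "increases expression"
        elif ratio <= 1.0 / min_fold:
            hits[reagent] = "decreases expression"
    return hits


# Hypothetical measurements for three test reagents.
hits = screen_reagents(100.0, {"reagent A": 30.0,
                               "reagent B": 105.0,
                               "reagent C": 220.0})
print(hits)
```
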

[0032] The invention further provides a pharmaceutical composition comprising a reagent identified by this method.
[0033] Another embodiment of the invention is a pharmaceutical composition which includes a polypeptide
comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or which is encoded by a polynucleotide comprising
a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75.

[0034] A further embodiment of the invention is a pharmaceutical composition comprising a polynucleotide including
a sequence which hybridizes under stringent conditions to a polynucleotide comprising a polynucleotide selected from
SEQ ID NO: 1 to 26 and 53 to 75 and encoding a polypeptide exhibiting the same biological function as given for the
respective polynucleotide in Table 2 or 3, or encoding a polypeptide comprising a polypeptide selected from SEQ ID
NO: 27 to 52 and 76 to 98. Pharmaceutical compositions useful in the present invention may further include fusion
proteins comprising a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98, or a
fragment thereof, antibodies, or antibody fragments.
[0035]



Fig. 1 shows a sketch of chromosome 17 with G-banding pattern and cytogenetic positions. In the blow-up at
the lower part of the figure a detailed view of the chromosomal area of the long arm of chromosome 17
(17q12-21.1) is provided. Each vertical rectangle, depicted in medium gray, represents a gene as labeled below
or above the individual position. The order of genes depicted in this graph has been deduced from experiments
examining the amplification and overexpression and from publicly available data (e.g. UCSC, NCBI or Ensembl).

Fig. 2 shows the same region as depicted before in Fig. 1 and a cluster representation of the individual expression
values measured by DNA-chip hybridization. The squares representing genes are indicated by a dotted line.
In the upper part of the cluster representation 4 tumor cell lines, of which two harbor a known HER-2/neu
overexpression (SKBR3 and AU565), are depicted with their individual expression profiles. Not only does the
HER-2/neu gene show a clear overexpression but, as provided by this invention, so do several other genes in the
surrounding region. In the middle part of the cluster representation, expression data obtained from
immunohistochemically characterized tumor samples are presented. Two of the depicted probes show a significant
overexpression of genes marked by the white rectangles. For additional information and comparison, expression
profiles of several non-diseased human tissues (RNAs obtained from Clontech Inc.) are provided. Human brain and
neural tissue display the closest relation to the expression profile of HER-2/neu positive tumors.

Fig. 3 provides data from DNA amplification measurements by qPCR (e.g. TaqMan). The data indicate that several
of the analyzed breast cancer cell lines harbor amplifications of genes located in the previously described
region (ARCHEON). Data are displayed with each gene on the x-axis and 40-Ct on the y-axis. Data were
normalized to the expression level of GAPDH as seen in the first group of columns.

Fig. 4 represents a graphical overview of the amplified regions and provides information on the length of the
individual amplification and overexpression in the analyzed tumor cell lines. The length of the amplification and
the composition of genes have a significant impact on the nature of the cancer cell and on the responsiveness
to certain drugs, as described elsewhere.

DETAILED DESCRIPTION OF THE INVENTION

Definitions 

[0036] "Differential expression", as used herein, refers to both quantitative as well as qualitative differences in the
genes' expression patterns depending on differential development and/or tumor growth. Differentially expressed genes
may represent "marker genes" and/or "target genes". The expression pattern of a differentially expressed gene
disclosed herein may be utilized as part of a prognostic or diagnostic breast cancer evaluation. Alternatively, a differentially
expressed gene disclosed herein may be used in methods for identifying reagents and compounds and uses of these
reagents and compounds for the treatment of breast cancer as well as methods of treatment.
[0037] "Biological activity" or "bioactivity" or "activity" or "biological function", which are used interchangeably herein,
mean an effector or antigenic function that is directly or indirectly performed by a polypeptide (whether in its native or
denatured conformation), or by any fragment thereof in vivo or in vitro. Biological activities include but are not limited
to binding to polypeptides, binding to other proteins or molecules, enzymatic activity, signal transduction, activity as a
DNA binding protein, as a transcription regulator, ability to bind damaged DNA, etc. A bioactivity can be modulated by
directly affecting the subject polypeptide. Alternatively, a bioactivity can be altered by modulating the level of the
polypeptide, such as by modulating expression of the corresponding gene.

[0038] The term "marker" or "biomarker" refers to a biological molecule, e.g., a nucleic acid, peptide, hormone, etc.,
whose presence or concentration can be detected and correlated with a known condition, such as a disease state.
[0039] "Marker gene", as used herein, refers to a differentially expressed gene whose expression pattern may be
utilized as part of a predictive, prognostic or diagnostic evaluation of malignant neoplasia or breast cancer, or which,
alternatively, may be used in methods for identifying compounds useful for the treatment or prevention of malignant
neoplasia and breast cancer in particular. A marker gene may also have the characteristics of a target gene.
[0040] "Target gene", as used herein, refers to a differentially expressed gene involved in breast cancer in a manner
by which modulation of the level of target gene expression or of target gene product activity may act to ameliorate
symptoms of malignant neoplasia and breast cancer in particular. A target gene may also have the characteristics of
a marker gene.

[0041] The term "biological sample", as used herein, refers to a sample obtained from an organism or from
components (e.g., cells) of an organism. The sample may be of any biological tissue or fluid. Frequently the sample will be
a "clinical sample", which is a sample derived from a patient. Such samples include, but are not limited to, sputum,
blood, blood cells (e.g., white cells), tissue or fine-needle biopsy samples, cell-containing body fluids, free-floating
nucleic acids, urine, peritoneal fluid, and pleural fluid, or cells therefrom. Biological samples may also include sections
of tissues such as frozen sections taken for histological purposes.

[0042] By "array" or "matrix" is meant an arrangement of addressable locations or "addresses" on a device. The
locations can be arranged in two dimensional arrays, three dimensional arrays, or other matrix formats. The number
of locations can range from several to at least hundreds of thousands. Most importantly, each location represents a
totally independent reaction site. Arrays include but are not limited to nucleic acid arrays, protein arrays and antibody
arrays. A "nucleic acid array" refers to an array containing nucleic acid probes, such as oligonucleotides,
polynucleotides or larger portions of genes. The nucleic acid on the array is preferably single stranded. Arrays wherein the
probes are oligonucleotides are referred to as "oligonucleotide arrays" or "oligonucleotide chips". A "microarray", herein
also referred to as a "biochip" or "biological chip", is an array of regions having a density of discrete regions of at least
about 100/cm², and preferably at least about 1000/cm². The regions in a microarray have typical dimensions, e.g., diameters,
in the range of between about 10-250 µm, and are separated from other regions in the array by about the same distance.
A "protein array" refers to an array containing polypeptide probes or protein probes which can be in native form or
denatured. An "antibody array" refers to an array containing antibodies which include but are not limited to monoclonal
antibodies (e.g. from a mouse), chimeric antibodies, humanized antibodies or phage antibodies and single chain
antibodies as well as fragments from antibodies.
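
The relationship between region diameter, spacing and density stated above can be checked with a short calculation. The following is only an illustrative sketch: it assumes a regular square grid (a layout the text does not specify), and the function name spots_per_cm2 is hypothetical.

```python
from typing import Optional


def spots_per_cm2(diameter_um: float, gap_um: Optional[float] = None) -> float:
    """Density of discrete regions on an assumed square grid.

    diameter_um: diameter of one region in micrometres.
    gap_um: edge-to-edge distance between neighbouring regions; defaults to
            the diameter ("separated by about the same distance").
    """
    if gap_um is None:
        gap_um = diameter_um
    # Centre-to-centre pitch in centimetres (1 um = 1e-4 cm).
    pitch_cm = (diameter_um + gap_um) * 1e-4
    # One region per pitch x pitch square.
    return 1.0 / (pitch_cm ** 2)


# Upper and lower ends of the stated 10-250 um diameter range.
print(spots_per_cm2(250.0))  # about 400 regions per cm^2
print(spots_per_cm2(10.0))   # about 250,000 regions per cm^2
```

Under this grid assumption, even the largest stated diameter yields a density above the "at least about 100/cm²" threshold, and the smallest diameter reaches the "hundreds of thousands" of locations mentioned above.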

[0043] The term "agonist", as used herein, is meant to refer to an agent that mimics or upregulates (e.g., potentiates
or supplements) the bioactivity of a protein. An agonist can be a wild-type protein or derivative thereof having at least
one bioactivity of the wild-type protein. An agonist can also be a compound that upregulates expression of a gene or
which increases at least one bioactivity of a protein. An agonist can also be a compound which increases the interaction
of a polypeptide with another molecule, e.g., a target peptide or nucleic acid.

[0044] The term "antagonist", as used herein, is meant to refer to an agent that downregulates (e.g., suppresses or
inhibits) at least one bioactivity of a protein. An antagonist can be a compound which inhibits or decreases the interaction
between a protein and another molecule, e.g., a target peptide, a ligand or an enzyme substrate. An antagonist can
also be a compound that downregulates expression of a gene or which reduces the amount of expressed protein
present.

[0045] "Small molecule", as used herein, is meant to refer to a composition which has a molecular weight of less
than about 5 kD and most preferably less than about 4 kD. Small molecules can be nucleic acids, peptides, polypeptides,
peptidomimetics, carbohydrates, lipids or other organic (carbon-containing) or inorganic molecules. Many
pharmaceutical companies have extensive libraries of chemical and/or biological mixtures, often fungal, bacterial, or algal
extracts, which can be screened with any of the assays of the invention to identify compounds that modulate a bioactivity.
[0046] The terms "modulated" or "modulation" or "regulated" or "regulation" and "differentially regulated", as used
herein, refer to both upregulation [i.e., activation or stimulation (e.g., by agonizing or potentiating)] and downregulation
[i.e., inhibition or suppression (e.g., by antagonizing, decreasing or inhibiting)].

[0047] "Transcriptional regulatory unit" refers to DNA sequences, such as initiation signals, enhancers, and
promoters, which induce or control transcription of protein coding sequences with which they are operably linked. In preferred
embodiments, transcription of one of the genes is under the control of a promoter sequence (or other transcriptional
regulatory sequence) which controls the expression of the recombinant gene in a cell-type in which expression is
intended. It will also be understood that the recombinant gene can be under the control of transcriptional regulatory
sequences which are the same or which are different from those sequences which control transcription of the naturally
occurring forms of the polypeptide.

[0048] The term "derivative" refers to the chemical modification of a polypeptide sequence or a polynucleotide
sequence. Chemical modifications of a polynucleotide sequence can include, for example, replacement of hydrogen by
an alkyl, acyl, or amino group. A derivative polynucleotide encodes a polypeptide which retains at least one biological
or immunological function of the natural molecule. A derivative polypeptide is one modified by glycosylation, pegylation,
or any similar process that retains at least one biological or immunological function of the polypeptide from which it
was derived.

[0049] The term "nucleotide analog" refers to oligomers or polymers being at least in one feature different from
naturally occurring nucleotides, oligonucleotides or polynucleotides, but exhibiting functional features of the respective
naturally occurring nucleotides (e.g. base pairing, hybridization, coding information) and that can be used for said
compositions. The nucleotide analogs can consist of non-naturally occurring bases or polymer backbones, examples of
which are LNAs, PNAs and Morpholinos. The nucleotide analog has at least one molecule different from its naturally
occurring counterpart or equivalent.

[0050] "BREAST CANCER GENES" or "BREAST CANCER GENE", as used herein, refers to the polynucleotides of
SEQ ID NO: 1 to 26 and 53 to 75, as well as derivatives, fragments, analogs and homologues thereof, the polypeptides
encoded thereby, the polypeptides of SEQ ID NO: 27 to 52 and 76 to 98 as well as derivatives, fragments, analogs
and homologues thereof and the corresponding genomic transcription units which can be derived or identified with
standard techniques well known in the art using the information disclosed in Tables 1 to 5 and Figures 1 to 4. The
GenBank, Locuslink ID and UniGene accession numbers of the polynucleotide sequences of SEQ ID NO: 1 to
26 and 53 to 75 and the polypeptides of SEQ ID NO: 27 to 52 and 76 to 98 are shown in Table 1; the gene description,
gene function and subcellular localization are given in Tables 2 and 3.

[0051] The term "chromosomal region", as used herein, refers to a consecutive DNA stretch on a chromosome which
can be defined by cytogenetic or other genetic markers such as restriction fragment length polymorphisms (RFLPs), single
nucleotide polymorphisms (SNPs), expressed sequence tags (ESTs), sequence tagged sites (STSs), micro-satellites,
variable number of tandem repeats (VNTRs) and genes. Typically a chromosomal region consists of up to 2 Megabases
(MB), up to 4 MB, up to 6 MB, up to 8 MB, up to 10 MB, up to 20 MB or even more MB.

[0052] The term "altered chromosomal region" or "aberrant chromosomal region" refers to a structural change of
the chromosomal composition and DNA sequence, which can occur by the following events: amplifications, deletions,
inversions, insertions, translocations and/or viral integrations. A trisomy, where a given cell harbors more than two
copies of a chromosome, is within the meaning of the term "amplification" of a chromosome or chromosomal region.
[0053] The present invention provides polynucleotide sequences and proteins encoded thereby, as well as probes
derived from the polynucleotide sequences, antibodies directed to the encoded proteins, and predictive, preventive,
diagnostic, prognostic and therapeutic uses for individuals who are at risk for or who have malignant neoplasia and
breast cancer in particular. The sequences disclosed herein have been found to be differentially expressed in samples
from breast cancer.

[0054] The present invention is based on the identification of 43 genes that are differentially regulated (up- or
down-regulated) in tumor biopsies of patients with clinical evidence of breast cancer. The identification of 43 human genes
which were not known to be differentially regulated in breast cancer states and their significance for the disease is
described in the working examples herein. The characterization of the co-expression of these genes provides newly
identified roles in breast cancer. The gene names, the database accession numbers (GenBank and UniGene) as well
as the putative or known functions of the encoded proteins and their subcellular localization are given in Tables 1 to
4. The primer sequences used for the gene amplification are shown in Table 5.

[0055] In either situation, detecting expression of these genes in excess or at a lower level as compared to normal
expression provides the basis for the diagnosis of malignant neoplasia and breast cancer. Furthermore, in testing the
efficacy of compounds during clinical trials, a decrease in the level of the expression of these genes corresponds to a
return from a disease condition to a normal state, and thereby indicates a positive effect of the compound.

[0056] Another aspect of the present invention is based on the observation that neighboring genes within defined
genomic regions functionally interact and influence each other's function directly or indirectly. A genomic region
encoding functionally interacting genes that are co-amplified and co-expressed in neoplastic lesions has been defined as an
"ARCHEON" (ARCHEON - Altered Region of Changed Chromosomal Expression Observed in Neoplasms).
Chromosomal alterations often affect more than one gene. This is true for amplifications, duplications, insertions,
integrations, inversions, translocations, and deletions. These changes can influence the expression level of single
or multiple genes. Most commonly in the field of cancer diagnostics and treatment, the changes of expression levels
have been investigated for single, putatively relevant target genes such as MLVI2 (5p14), NRASL3 (6p12), EGFR (7p12),
c-myc (8q23), Cyclin D1 (11q13), IGF1R (15q25), HER-2/neu (17q12), PCNA (20q12). However, the altered expression
level and interaction of multiple (i.e. more than two) genes within one genomic region with each other has not been
addressed. Genes of an ARCHEON form gene clusters with tissue specific expression patterns. The mode of interaction
of individual genes within such a gene cluster suspected to represent an ARCHEON can be either protein-protein or
protein-nucleic acid interaction, which may be illustrated by, but is not limited to, the following examples: ARCHEON gene
interaction may be in the same signal transduction pathway, may be receptor to ligand binding, receptor kinase and
SH2 or SH3 binding, transcription factor to promoter binding, nuclear hormone receptor to transcription factor binding,
phosphogroup donation (e.g. kinases) and acceptance (e.g. phosphoprotein), mRNA stabilizing protein binding and
transcriptional processes. The individual activity and specificity of a pair of genes and/or the proteins encoded thereby,
or of a group of such in a higher order, may be readily deduced by the skilled person from literature published or
deposited within public databases. However, in the context of an ARCHEON, the interaction of its members
will potentiate, exaggerate or reduce their singular functions. This interaction is of importance in defined
normal tissues in which they are normally co-expressed. Therefore, these clusters have been commonly conserved
during evolution. The aberrant expression of members of these ARCHEONs in neoplastic lesions, however (especially
within tissues in which they are normally not expressed), influences tumor characteristics such as growth,
invasiveness and drug responsiveness. Due to the interaction of these neighboring genes, it is of importance to determine
the members of the ARCHEON which are involved in the deregulation events. In this regard amplification and deletion
events in neoplastic lesions are of special interest.

[0057] The invention relates to a method for the detection of chromosomal alterations by (a) determining the relative
mRNA abundance of individual mRNA species or (b) determining the copy number of one or more chromosomal
region(s) by quantitative PCR. In one embodiment information on the genomic organization and spatial regulation of
chromosomal regions is assessed by bioinformatic analysis of the sequence information of the human genome (UCSC,
NCBI) and then combined with RNA expression data from GeneChip™ DNA-Arrays (Affymetrix) and/or quantitative
PCR (TaqMan) from RNA-samples or genomic DNA.
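
As an illustration of how quantitative PCR readings can be turned into a relative abundance or copy number estimate, the sketch below uses the widely used comparative Ct calculation (2 to the power of minus delta-delta-Ct). This formula, the assumption of perfect doubling per cycle, the diploid control, and all function names and Ct values are assumptions made for illustration; the text above does not state which calculation was applied.

```python
def relative_quantity(ct_target_sample, ct_ref_sample,
                      ct_target_control, ct_ref_control,
                      efficiency=2.0):
    """Comparative Ct estimate of target abundance in a sample vs. a control.

    Each Ct of the target gene is first normalized to a reference gene
    (for example a housekeeping gene such as GAPDH), then the sample is
    compared with the control. efficiency=2.0 assumes perfect doubling
    of product in every PCR cycle.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_sample - delta_control
    return efficiency ** (-delta_delta_ct)


def estimated_copy_number(ratio, control_copies=2):
    """Scale a DNA-based ratio by the copy number assumed for the control."""
    return control_copies * ratio


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# (relative to the reference gene) in the tumor sample than in the control.
ratio = relative_quantity(22.0, 20.0, 25.0, 20.0)
print(ratio)                         # 8.0
print(estimated_copy_number(ratio))  # 16.0
```

The same function can be applied to RNA-derived Ct values for option (a) or to genomic DNA-derived Ct values for option (b); only the second function, which assumes a two-copy control, is specific to copy number.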

[0058] In a further embodiment the functional relationship of genes located on a chromosomal region which is altered
(amplified or deleted) is established. The altered chromosomal region is defined as an ARCHEON if genes located on
that region functionally interact.
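
The definition above can be read as a two-step test: find a contiguous altered region, then check whether genes on it functionally interact. The following sketch illustrates that reading only. The gene identifiers, the rule that a region needs at least two neighboring amplified genes, and the function names are hypothetical and are not taken from the working examples.

```python
from itertools import combinations


def amplified_runs(genes_in_order, amplified):
    """Return maximal runs of two or more neighboring amplified genes.

    genes_in_order: gene identifiers sorted by chromosomal position.
    amplified: set of identifiers scored as amplified.
    """
    runs, run = [], []
    for gene in genes_in_order:
        if gene in amplified:
            run.append(gene)
        else:
            if len(run) >= 2:
                runs.append(run)
            run = []
    if len(run) >= 2:
        runs.append(run)
    return runs


def archeon_candidates(genes_in_order, amplified, interactions):
    """Keep only those altered runs in which at least one gene pair interacts."""
    known_pairs = {frozenset(pair) for pair in interactions}
    return [
        run for run in amplified_runs(genes_in_order, amplified)
        if any(frozenset(pair) in known_pairs for pair in combinations(run, 2))
    ]


# Hypothetical input: six genes in chromosomal order, two amplified runs,
# and one documented functional interaction.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
amplified = {"g2", "g3", "g5", "g6"}
interactions = [("g2", "g3")]
print(archeon_candidates(genes, amplified, interactions))  # [['g2', 'g3']]
```

In this toy input both runs are altered, but only the first contains an interacting pair, so only it would be treated as an ARCHEON candidate under the stated assumptions.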

[0059] The 17q12 locus, harboring the HER-2/neu gene, was investigated as one model system. By establishing a
high-resolution assay to detect amplification events in neighboring genes, 43 genes that are commonly co-amplified
in breast cancer cell lines and patient samples were identified. By gene array technologies and immunological methods
their co-overexpression in tumor samples was demonstrated. Surprisingly, by clustering tissue samples with
HER-2/neu positive tumor samples, it was found that the expression pattern of this larger genomic region (consisting of 43
genes) is very similar to control brain tissue. HER-2/neu negative breast tumor tissue did not show a similar expression
pattern. Indeed, some of the genes within this cluster are important for neural development (HER-2/neu, THRA) in
mouse model systems or are described to be expressed in neural cells (NeuroD2). Moreover, by searching for similar
gene combinations in the human and rodent genomes, additional homologous chromosomal regions on chromosome
3p21 and 12q13 harboring several isoforms of the respective genes (see below) were found. There was strong
evidence for multiple interactions between the 43 candidate genes, as being part of identical pathways (HER-2/neu,
GRB7, CrkRS, CDC6), influencing the expression of each other (HER-2/neu, THRA, RARA), interacting with each
other (PPARGBP, THRA, RARA, NR1D1 or HER-2/neu, GRB7) or expressed in defined tissues (CACNB1, PPARGBP,
etc.). Interestingly, the genomic regions of the ARCHEONs that were identified are amplified in acquired Tamoxifen
resistance of HER-2/neu negative cells (MCF7), which are normally sensitive to Tamoxifen treatment [Achuthan et al.,
2001, (2)].

[0060] Moreover, altered responsiveness to treatment due to the alterations of the genes within these ARCHEONs
was observed. Surprisingly, genes within the ARCHEONs are of importance even in the absence of HER-2/neu
homologues. Some of the genes within the ARCHEONs not only serve as marker genes for prognostic purposes, but
have already been known as targets for therapeutic intervention. For example, TOP2 alpha is a target of anthracyclines.
THRA and RARA can be targeted by hormones and hormone analogs (e.g. T3, rT3, RA). Due to their high affinity
binding sites and available screening assays (reporter assays based on their transcriptional potential), the hormone
receptors, which are shown to be linked to neoplastic pathophysiology for the first time herein, are ideal targets for drug
screening and treatment of malignant neoplasia and breast cancer in particular. In this regard it is essential to know
which members of the ARCHEON are altered in the neoplastic lesions. Particularly it is important to know the nature,
number and extent to which the ARCHEON genes are amplified or deleted. The ARCHEONs are flanked by similar,
endogenous retroviruses (e.g. HERV-K = "human endogenous retrovirus"), some of which are activated in breast
cancer. These viruses may have also been involved in the evolutionary duplication of the ARCHEONs.

[0061] The analysis of the 17q12 region confirmed data obtained by IHC and identified several additional genes being
co-amplified with the HER-2/neu gene. Comparative analysis of RNA-based quantitative RT-PCR (TaqMan) with
DNA-based qPCR from tumor cell lines identified the same amplified region. Genes at the 17q11.2-21 region are offered
by way of illustration, not by way of limitation. A graphical display of the described chromosomal region is provided in
Figure 1.

Biological relevance of the genes which are part of the 17q12 ARCHEON 
MLN50 

[0062] By differential screening of cDNAs from breast cancer-derived metastatic axillary lymph nodes, TRAF4 and
3 other novel genes (MLN51, MLN62, MLN64) were identified that are overexpressed in breast cancer [Tomasetto et
al., 1995, (3)]. One gene, which they designated MLN50, was mapped to 17q11-q21.3 by radioactive in situ
hybridization. In breast cancer cell lines, overexpression of the 4 kb MLN50 mRNA was correlated with amplification of the
gene and with amplification and overexpression of ERBB2, which maps to the same region. The authors suggested
that the 2 genes belong to the same amplicon. Amplification of chromosomal region 17q11-q21 is one of the most
common events occurring in human breast cancers. They reported that the predicted 261-amino acid MLN50 protein
contains an N-terminal LIM domain and a C-terminal SH3 domain. They renamed the protein LASP1, for 'LIM and SH3
protein.' Northern blot analysis revealed that LASP1 mRNA was expressed at a basal level in all normal tissues
examined and overexpressed in 8% of primary breast cancers. In most of these cancers, LASP1 and ERBB2 were
simultaneously overexpressed.

MLLT6 

[0063] The MLLT6 (AF17) gene encodes a protein of 1,093 amino acids, containing a leucine-zipper dimerization
motif located 3-prime of the fusion point and a cysteine-rich domain at the N terminus. AF17 was found to contain
stretches of amino acids previously associated with domains involved in transcriptional repression or activation.
[0064] Chromosome translocations involving band 11q23 are associated with approximately 10% of patients with
acute lymphoblastic leukemia (ALL) and more than 5% of patients with acute myeloid leukemia (AML). The gene at
11q23 involved in the translocations is variously designated ALL1, HRX, MLL, and TAX1. The partner gene in one of
the rarer translocations, t(11;17)(q23;q21), is designated MLLT6 and lies on 17q12.

ZNF144 (Mel18)

[0065] Mel18 cDNA encodes a novel cys-rich zinc finger motif. The gene is expressed strongly in most tumor cell
lines, but its normal tissue expression was limited to cells of neural origin and was especially abundant in fetal neural
cells. It belongs to a RING-finger motif family which includes BMI1. The MEL18/BMI1 gene family represents a
mammalian homolog of the Drosophila 'Polycomb' gene group, thereby belonging to a memory mechanism involved in
maintaining the expression pattern of key regulatory factors such as Hox genes. The Bmi1, Mel18 and M33 genes are
representative examples of mouse Pc-G genes. Common phenotypes observed in knockout mice mutant for each of
these genes indicate an important role for Pc-G genes not only in regulation of Hox gene expression and axial skeleton
development but also in control of proliferation and survival of haematopoietic cell lineages. This is in line with the
proliferative deregulation observed in lymphoblastic leukemia. The MEL18 gene is conserved among
vertebrates. Its mRNA is expressed at high levels in placenta, lung, and kidney, and at lower levels in liver, pancreas, and
skeletal muscle. Interestingly, cervical and lumbo-sacral HOX gene expression is altered in several primary breast
cancers with respect to normal breast tissue, with the HoxB gene cluster being present on 17q distal to the 17q12 locus.
Moreover, delay of differentiation with persistent nests of proliferating cells was found in endothelial cells cocultured
with HOXB7-transduced SkBr3 cells, which exhibit a 17q12 amplification. Tumorigenicity of these cells has been
evaluated in vivo. Xenografts in athymic nude mice showed that SkBr3/HOXB7 cells developed tumors with an increased
number of blood vessels whether or not the mice were irradiated, whereas parental SkBr3 cells did not show any tumor
take unless mice were sublethally irradiated. As part of this invention, we have found MEL18 to be overexpressed
specifically in tumors bearing Her-2/neu gene amplification, which can be critical for Hox expression.

PHOSPHATIDYLINOSITOL-4-PHOSPHATE 5-KINASE, TYPE II, BETA; PIP5K2B

[0066] Phosphoinositide kinases play central roles in signal transduction. Phosphatidylinositol-4-phosphate
5-kinases (PIP5Ks) phosphorylate phosphatidylinositol 4-phosphate, giving rise to phosphatidylinositol 4,5-bisphosphate. The
PIP5K enzymes exist as multiple isoforms that have various immunoreactivities, kinetic properties, and molecular
masses. They are unique in that they possess almost no homology to the kinase motifs present in other
phosphatidylinositol, protein, and lipid kinases. By screening a human fetal brain cDNA library with the PIP5K2B EST, the full-length
gene could be isolated. The deduced 416-amino acid protein is 78% identical to PIP5K2A. Using SDS-PAGE, the
authors estimated that bacterially expressed PIP5K2B has a molecular mass of 47 kD. Northern blot analysis detected
a 6.3-kb PIP5K2B transcript which was abundantly expressed in several human tissues. PIP5K2B interacts specifically
with the juxtamembrane region of the p55 TNF receptor (TNFR1), and PIP5K2B activity is increased in mammalian
cells by treatment with TNF-alpha. A modeled complex with membrane-bound substrate and ATP shows how a
phosphoinositide kinase can phosphorylate its substrate in situ at the membrane interface. The substrate-binding site is
open on 1 side, consistent with dual specificity for phosphatidylinositol 3- and 5-phosphates. Although the amino acid
sequence of PIP5K2A does not show homology to known kinases, recombinant PIP5K2A exhibited kinase activity.
PIP5K2A contains a putative Src homology 3 (SH3) domain-binding sequence. Overexpression of mouse PIP5K1B in
COS7 cells induced an increase in short actin fibers and a decrease in actin stress fibers.

TEM7 

[0067] Using serial analysis of gene expression (SAGE), partial cDNAs corresponding to several tumor endothelial
markers (TEMs) that displayed elevated expression during tumor angiogenesis could be identified. Among the genes
identified was TEM7. Using database searches and 5-prime RACE, the entire TEM7 coding region, which encodes a
500-amino acid type I transmembrane protein, has been described. The extracellular region of TEM7 contains a
plexin-like domain and has weak homology to the ECM protein nidogen. The function of these domains, which are usually
found in secreted and extracellular matrix molecules, is unknown. Nidogen itself belongs to the entactin protein family
and helps to determine pathways of migrating axons by switching from circumferential to longitudinal migration. Entactin
is involved in cell migration, as it promotes trophoblast outgrowth through a mechanism mediated by the RGD
recognition site, and plays an important role during invasion of the endometrial basement membrane at implantation. As
entactin promotes thymocyte adhesion but affects thymocyte migration only marginally, it is suggested that entactin
may play a role in thymocyte localization during T cell development.

[0068] In situ hybridization analysis of human colorectal cancer demonstrated that TEM7 was expressed clearly in
the endothelial cells of the tumor stroma but not in the endothelial cells of normal colonic tissue. Using in situ
hybridization to assay expression in various normal adult mouse tissues, they observed that TEM7 was largely undetectable
in mouse tissues or tumors, but was abundantly expressed in mouse brain.

ZNFN1A3 

[0069] By screening a B-cell cDNA library with a mouse Aiolos N-terminal cDNA probe, a cDNA encoding human
Aiolos, or ZNFN1A3, was obtained. The deduced 509-amino acid protein, which is 86% identical to its mouse
counterpart, has 4 DNA-binding zinc fingers in its N terminus and 2 zinc fingers that mediate protein dimerization in its C
terminus. These domains are 100% and 96% homologous to the corresponding domains in the mouse protein,
respectively. Northern blot analysis revealed strong expression of a major 11.0- and a minor 4.4-kb ZNFN1A3 transcript in
peripheral blood leukocytes, spleen, and thymus, with lower expression in liver, small intestine, and lung.
[0070] Ikaros (ZNFN1A1), a hemopoietic zinc finger DNA-binding protein, is a central regulator of lymphoid
differentiation and is implicated in leukemogenesis. The execution of normal function of Ikaros requires sequence-specific
DNA binding, transactivation, and dimerization domains. Mice with a mutation in a related zinc finger protein, Aiolos,
are prone to B-cell lymphoma. In chemically induced murine lymphomas, allelic losses on markers surrounding the
Znfn1a1 gene were detected in 27% of the tumors analyzed. Moreover, specific Ikaros expression was detected in primary
mouse hormone-producing anterior pituitary cells and was found to be important for fibroblast growth factor receptor 4
(FGFR4) expression, which itself is implicated in a multitude of endocrine cell hormonal and proliferative properties,
with FGFR4 being differentially expressed in normal and neoplastic pituitary. Moreover, Ikaros binds to chromatin
remodelling complexes containing SWI/SNF proteins, which antagonize Polycomb function. Interestingly, at the telomeric
end of the disclosed ARCHEON the SWI/SNF complex member SMARCE1 (= SWI/SNF-related, matrix-associated,
actin-dependent regulators of chromatin) is located and is part of the described amplification. The related binding
specificities of Ikaros and Palindrom Binding Protein (PBP) suggest that ZNFN1A3 is able to regulate the Her-2/neu enhancer.

PPP1R1B

[0071] Midbrain dopaminergic neurons play a critical role in multiple brain functions, and abnormal signaling through
dopaminergic pathways has been implicated in several major neurologic and psychiatric disorders. One well-studied
target for the actions of dopamine is DARPP32. In the densely dopamine- and glutamate-innervated rat
caudate-putamen, DARPP32 is expressed in medium-sized spiny neurons that also express dopamine D1 receptors. The
function of DARPP32 seems to be regulated by receptor stimulation. Both dopaminergic and glutamatergic (NMDA) receptor
stimulation regulate the extent of DARPP32 phosphorylation, but in opposite directions.

[0072] The human DARPP32 was isolated from a striatal cDNA library. The 204-amino acid DARPP32 protein shares
88% and 85% sequence identity, respectively, with bovine and rat DARPP32 proteins. The DARPP32 sequence is
particularly conserved through the N terminus, which represents the active portion of the protein. Northern blot analysis
demonstrated that the 2.1-kb DARPP32 mRNA is more highly expressed in human caudate than in cortex. In situ
hybridization to postmortem human brain showed a low level of DARPP32 expression in all neocortical layers, with
the strongest hybridization in the superficial layers. CDK5 phosphorylated DARPP32 in vitro and in intact brain cells.
Phospho-thr75 DARPP32 inhibits PKA in vitro by a competitive mechanism. Decreasing phospho-thr75 DARPP32 in
striatal cells either by a CDK5-specific inhibitor or by using genetically altered mice resulted in increased
dopamine-induced phosphorylation of PKA substrates and augmented peak voltage-gated calcium currents. Thus, DARPP32 is
a bifunctional signal transduction molecule which, by distinct mechanisms, controls a serine/threonine kinase and a
serine/threonine phosphatase.

[0073] DARPP32 and t-DARPP are overexpressed in gastric cancers. It is suggested that overexpression of these 2
proteins in gastric cancers may provide an important survival advantage to neoplastic cells. It could be demonstrated
that Darpp32 is an obligate intermediate in progesterone-facilitated sexual receptivity in female rats and mice. The
facilitative effect of progesterone on sexual receptivity in female rats was blocked by antisense oligonucleotides to
Darpp32. Homozygous mice carrying a null mutation for the Darpp32 gene exhibited minimal levels of
progesterone-facilitated sexual receptivity when compared to their wildtype littermates, and progesterone significantly increased
hypothalamic cAMP levels and cAMP-dependent protein kinase activity.

CACNB1 

[0074] In 1991 a cDNA clone encoding a protein with high homology to the beta subunit of the rabbit skeletal muscle
dihydropyridine-sensitive calcium channel was isolated from a rat brain cDNA library [Pragnell et al., 1991, (4)]. This rat
brain beta-subunit cDNA hybridized to a 3.4-kb message that was expressed at high levels in the cerebral hemispheres and
hippocampus and at much lower levels in cerebellum. The open reading frame encodes 597 amino acids with a predicted
mass of 65,679 Da and is 82% homologous with the skeletal muscle beta subunit. The corresponding human
beta-subunit gene was localized to chromosome 17 by analysis of somatic cell hybrids. The authors suggested that the
encoded brain beta subunit, which has a primary structure highly similar to its isoform in skeletal muscle, may have a
comparable role as an integral regulatory component of a neuronal calcium channel.

RPL19 

[0075] The ribosome is the only organelle conserved between prokaryotes and eukaryotes. In eukaryotes, this
organelle consists of a 60S large subunit and a 40S small subunit. The mammalian ribosome contains 4 species of RNA
and approximately 80 different ribosomal proteins, most of which appear to be present in equimolar amounts. In
mammalian cells, ribosomal proteins can account for up to 15% of the total cellular protein, and the expression of the
different ribosomal protein genes, which can account for up to 7 to 9% of the total cellular mRNAs, is coordinately
regulated to meet the cell's varying requirements for protein synthesis. The mammalian ribosomal protein genes are
members of multigene families, most of which are composed of multiple processed pseudogenes and a single functional
intron-containing gene. The presence of multiple pseudogenes hampered the isolation and study of the functional
ribosomal protein genes. By study of somatic cell hybrids, it has been elucidated that DNA sequences complementary to 6
mammalian ribosomal protein cDNAs could be assigned to chromosomes 5, 8, and 17. Ten fragments mapped to 3
chromosomes [Nakamichi et al., 1986, (5)]. These are probably a mixture of functional (expressed) genes and pseudogenes.
One that maps to 5q23-q33 rescues Chinese hamster emetine-resistance mutations in interspecies hybrids and is
therefore the transcriptionally active RPS14 gene. In 1989 a PCR-based strategy for the detection of intron-containing
genes in the presence of multiple pseudogenes was described. This technique was used to identify the intron-containing
PCR products of 7 human ribosomal protein genes and to map their chromosomal locations by hybridization to
human/rodent somatic cell hybrids [Feo et al., 1992, (6)]. All 7 ribosomal protein genes were found to be on different
chromosomes: RPL19 on 17p12-q11; RPL30 on 8; RPL35A on 18; RPL36A on 14; RPS6 on 9pter-p13; RPS11 on 19cen-qter;
and RPS17 on 11pter-p13. These are also different sites from the chromosomal location of previously mapped
ribosomal protein genes S14 on chromosome 5, S4 on Xq and Yp, and RP117A on 9q3-q34. By fluorescence in situ
hybridization the position of the RPL19 gene was mapped to 17q11 [Davies et al., 1989, (7)].

PPARBP, PBP, CRSP1, CRSP200, TRIP2, TRAP220, RB18A, DRIP230 

[0076] The thyroid hormone receptors (TRs) are hormone-dependent transcription factors that regulate expression
of a variety of specific target genes. They must specifically interact with a number of proteins as they progress from
their initial translation and nuclear translocation to heterodimerization with retinoid X receptors (RXRs), functional
interactions with other transcription factors and the basic transcriptional apparatus, and eventually, degradation. To
help elucidate the mechanisms that underlie the transcriptional effects and other potential functions of TRs, the yeast
interaction trap, a version of the yeast 2-hybrid system, was used to identify proteins that specifically interact with
the ligand-binding domain of rat TR-beta-1 (THRB) [Lee et al., 1995, (8)]. The authors isolated HeLa cell cDNAs encoding
several different TR-interacting proteins (TRIPs), including TRIP2. TRIP2 interacted with rat Thrb only in the presence
of thyroid hormone. It showed a ligand-independent interaction with RXR-alpha, but did not interact with the
glucocorticoid receptor (NR3C1) under any condition. By immunoscreening a human B-lymphoma cell cDNA expression library
with the anti-p53 monoclonal antibody PAb1801, PPARBP was identified, which was called RB18A for 'recognized by
PAb1801 monoclonal antibody' [Drane et al., 1997, (9)]. The predicted 1,566-amino acid RB18A protein contains
several potential nuclear localization signals, 13 potential N-glycosylation sites, and a high number of potential
phosphorylation sites. Despite sharing common antigenic determinants with p53, RB18A does not show significant
nucleotide or amino acid sequence similarity with p53. Whereas the calculated molecular mass of RB18A is 166 kD, the
apparent mass of recombinant RB18A was 205 kD by SDS-PAGE analysis. The authors demonstrated that RB18A
shares functional properties with p53, including DNA binding, p53 binding, and self-oligomerization. Furthermore,
RB18A was able to activate the sequence-specific binding of p53 to DNA, which was induced through an unstable
interaction between both proteins. Northern blot analysis of human tissues detected an 8.5-kb RB18A transcript in all
tissues examined except kidney, with highest expression in heart. Moreover, mouse Pparbp, which was called Pbp for
'Ppar-binding protein', was identified as a protein that interacts with the Ppar-gamma (PPARG) ligand-binding domain in
a yeast 2-hybrid system [Zhu et al., 1997, (10)]. The authors found that Pbp also binds to PPAR-alpha (PPARA),
RAR-alpha (RARA), RXR, and TR-beta-1 in vitro. The binding of Pbp to these receptors increased in the presence of
specific ligands. Deletion of the last 12 amino acids from the C terminus of PPAR-gamma resulted in the abolition of
interaction between Pbp and PPAR-gamma. Pbp modestly increased the transcriptional activity of PPAR-gamma, and
a truncated form of Pbp acted as a dominant-negative repressor, suggesting that Pbp is a genuine transcriptional
co-activator for PPAR. The predicted 1,560-amino acid Pbp protein contains 2 LXXLL motifs, which are considered
necessary and sufficient for the binding of several co-activators to nuclear receptors. Northern blot analysis detected
Pbp expression in all mouse tissues examined, with higher levels in liver, kidney, lung, and testis. In situ
hybridization showed that Pbp is expressed during mouse ontogeny, suggesting a possible role for Pbp in cellular
proliferation and differentiation. In adult mouse, in situ hybridization detected Pbp expression in liver, bronchial
epithelium in the lung, intestinal mucosa, kidney cortex, thymic cortex, splenic follicles, and seminiferous epithelium
in testis. Later on, PPARBP was identified, under the name TRAP220, from an immunopurified TR-alpha (THRA)-TRAP complex
[Yuan et al., 1998, (11)]. The authors cloned Jurkat cell cDNAs encoding TRAP220. The predicted 1,581-amino acid TRAP220
protein contains LXXLL domains, which are found in other nuclear receptor-interacting proteins. TRAP220 is nearly
identical to RB18A, with these proteins differing primarily by an extended N terminus on TRAP220. In the absence of
TR-alpha, TRAP220 appears to reside in a single complex with other TRAPs. TRAP220 showed a direct
ligand-dependent interaction with TR-alpha, which was mediated through the C terminus of TR-alpha and, at least in part, the
LXXLL domains of TRAP220. TRAP220 also interacted with other nuclear receptors, including vitamin D receptor,
RARA, RXRA, PPARA, PPARG, and estrogen receptor-alpha (ESR1; 133430), in a ligand-dependent manner.
TRAP220 moderately stimulated human TR-alpha-mediated transcription in transfected cells, whereas a fragment
containing the LXXLL motifs acted as a dominant-negative inhibitor of nuclear receptor-mediated transcription both in
transfected cells and in cell-free transcription systems. Further studies indicated that TRAP220 plays a major role in
anchoring other TRAPs to TR-alpha during the function of the TR-alpha-TRAP complex and that TRAP220 may be a
global co-activator for the nuclear receptor superfamily. PBP, a nuclear receptor co-activator, interacts with estrogen
receptor-alpha (ESR1) in the absence of estrogen. This interaction was enhanced in the presence of estrogen, but
was reduced in the presence of the anti-estrogen Tamoxifen. Transfection of PBP into cultured cells resulted in
enhancement of estrogen-dependent transcription, indicating that PBP serves as a co-activator in estrogen receptor
signaling. To examine whether overexpression of PBP plays a role in breast cancer because of its co-activator function
in estrogen receptor signaling, the levels of PBP expression in breast tumors were determined [Zhu et al., 1999, (12)].
High levels of PBP expression were detected in approximately 50% of primary breast cancers and breast cancer cell
lines by ribonuclease protection analysis, in situ hybridization, and immunoperoxidase staining. By using FISH, the
authors mapped the PBP gene to 17q12, a region that is amplified in some breast cancers. They found PBP gene
amplification in approximately 24% (6 of 25) of breast tumors and approximately 30% (2 of 6) of breast cancer cell
lines, implying that PBP gene overexpression can occur independent of gene amplification. They determined that the
PBP gene comprises 17 exons that together span more than 37 kb. Their findings, in particular PBP gene amplification,
suggested that PBP, by its ability to function as an estrogen receptor-alpha co-activator, may play a role in mammary
epithelial differentiation and in breast carcinogenesis.
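
As a quick arithmetic check on the amplification counts quoted in this paragraph, the fractions can be recomputed directly. This is only an illustrative sketch: the helper name `percent` is hypothetical and is not part of the cited study.

```python
def percent(hits: int, total: int) -> float:
    """Return hits/total as a percentage, rounded to one decimal place."""
    return round(100.0 * hits / total, 1)

# Counts as quoted in the text (Zhu et al., 1999).
tumors = percent(6, 25)      # breast tumors with PBP gene amplification
cell_lines = percent(2, 6)   # breast cancer cell lines with PBP gene amplification

print(tumors, cell_lines)
```

On these counts the cell-line fraction works out nearer to one third, so the "approximately 30%" in the text should be read as a loose rounding.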

NEUROD2 

[0077] Basic helix-loop-helix (bHLH) proteins are transcription factors involved in determining cell type during
development. In 1995 a bHLH protein was described, termed NeuroD (for 'neurogenic differentiation'), that functions
during neurogenesis [Lee et al., 1995, (13)]. The human NEUROD gene maps to chromosome 2q32. The cloning and
characterization of 2 additional NEUROD genes, NEUROD2 and NEUROD3, was described in 1996 [McCormick et al.,
1996, (14)]. Sequences for the mouse and human homologues were presented. NEUROD2 shows a high degree of
homology to the bHLH region of NEUROD, whereas NEUROD3 is more distantly related. The authors found that mouse
neuroD2 was initially expressed at embryonic day 11, with persistent expression in the adult nervous system. Similar
to neuroD, neuroD2 appears to mediate neuronal differentiation. The human NEUROD2 was mapped to 17q12 by
fluorescence in situ hybridization and the mouse homologue to chromosome 11 [Tamimi et al., 1997, (15)].

TELETHONIN 

[0078] Telethonin is a sarcomeric protein of 19 kD found exclusively in striated and cardiac muscle. It appears to be
localized to the Z disc of adult skeletal muscle and cultured myocytes. Telethonin is a substrate of titin, which acts as
a molecular 'ruler' for the assembly of the sarcomere by providing spatially defined binding sites for other sarcomeric
proteins. After activation by phosphorylation and calcium/calmodulin binding, titin phosphorylates the C-terminal
domain of telethonin in early differentiating myocytes. The telethonin gene has been mapped to 17q12, adjacent to the
phenylethanolamine N-methyltransferase gene [Valle et al., 1997, (16)].

PENT, PNMT 

[0079] Phenylethanolamine N-methyltransferase catalyzes the synthesis of epinephrine from norepinephrine, the
last step of catecholamine biosynthesis. The cDNA clone was first isolated in 1988 for bovine adrenal medulla PNMT
using mixed oligodeoxyribonucleotide probes whose synthesis was based on the partial amino acid sequence of tryptic
peptides from the bovine enzyme [Kaneda et al., 1988, (17)]. Using a bovine cDNA as a probe, the authors screened
a human pheochromocytoma cDNA library and isolated a cDNA clone with an insert of about 1.0 kb, which contained
a complete coding region of the enzyme. Northern blot analysis of human pheochromocytoma polyadenylated RNA
using this cDNA insert as the probe demonstrated a single RNA species of about 1,000 nucleotides, suggesting that
this clone is a full-length cDNA. The nucleotide sequence showed that human PNMT has 282 amino acid residues
with a predicted molecular weight of 30,853, including the initial methionine. The amino acid sequence was 88%
homologous to that of the bovine enzyme. The PNMT gene was found to consist of 3 exons and 2 introns spanning about
2,100 basepairs. It was demonstrated that in transgenic mice the gene is expressed in adrenal medulla and retina. A
hybrid gene consisting of 2 kb of the PNMT 5-prime-flanking region fused to the simian virus 40 early region also
resulted in tumor antigen mRNA expression in adrenal glands and eyes; furthermore, immunocytochemistry showed
that the tumor antigen was localized in nuclei of adrenal medullary cells and cells of the inner nuclear cell layer of the
retina, both prominent sites of epinephrine synthesis. The results indicate that the enhancer(s) for appropriate
expression of the gene in these cell types are in the 2-kb 5-prime-flanking region of the gene.

[0080] Kaneda et al., 1988 (17), assigned the human PNMT gene to chromosome 17 by Southern blot analysis of
DNA from mouse-human somatic cell hybrids. In 1992 the localization was narrowed down to 17q21-q22 by linkage
analysis using RFLPs related to the PNMT gene and several 17q DNA markers [Hoehe et al., 1992, (18)]. The findings
are of interest in light of the description of a genetic locus associated with blood pressure regulation in the stroke-prone
spontaneously hypertensive rat (SHR-SP) on rat chromosome 10 in a conserved linkage synteny group corresponding
to human chromosome 17q22-q24. See essential hypertension.

MGC9753 

[0081] This gene maps on chromosome 17, at 17q12 according to RefSeq. It is expressed at a very high level. It is
defined by cDNA clones and produces, by alternative splicing, 7 different transcripts (SEQ ID NO:60
to 66 and 83 to 89, Table 1), altogether encoding 7 different protein isoforms. Of specific interest is the putatively
secreted isoform g, encoded by an mRNA of 2.55 kb. Its premessenger covers 16.94 kb on the genome. It has a very
long 3' UTR. The protein (226 aa, MW 24.6 kDa, pI 8.5) contains no Pfam motif. The MGC9753 gene produces, by
alternative splicing, 7 types of transcripts, predicted to encode 7 distinct proteins. It contains 13 confirmed introns, 10
of which are alternative. Comparison to the genome sequence shows that 11 introns follow the consensual [gt-ag] rule,
1 is atypical with good support [tg_cg]. The six most abundant isoforms are designated by a) to i) and code for proteins
as follows:

a) This mRNA is 3.03 kb long, its premessenger covers 16.95 kb on the genome. It has a very long 3' UTR. The
protein (190 aa, MW 21.5 kDa, pI 7.2) contains no Pfam motif. It is predicted to localise in the endoplasmic reticulum.

c) This mRNA is 1.17 kb long, its premessenger covers 16.93 kb on the genome. It may be incomplete at the N
terminus. The protein (368 aa, MW 41.5 kDa, pI 7.3) contains no Pfam motif.

d) This mRNA is 3.17 kb long, its premessenger covers 16.94 kb on the genome. It has a very long 3' UTR and
5' UTR. The protein (190 aa, MW 21.5 kDa, pI 7.2) contains no Pfam motif. It is predicted to localise in the
endoplasmic reticulum.

g) This mRNA is 2.55 kb long, its premessenger covers 16.94 kb on the genome. It has a very long 3' UTR. The
protein (226 aa, MW 24.6 kDa, pI 8.5) contains no Pfam motif. It is predicted to be secreted.

h) This mRNA is 2.68 kb long, its premessenger covers 16.94 kb on the genome. It has a very long 3' UTR. The
protein (320 aa, MW 36.5 kDa, pI 6.8) contains no Pfam motif. It is predicted to localise in the endoplasmic reticulum.

i) This mRNA is 2.34 kb long, its premessenger covers 16.94 kb on the genome. It may be incomplete at the N
terminus. It has a very long 3' UTR. The protein (217 aa, MW 24.4 kDa, pI 5.9) contains no Pfam motif.
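
The residue counts and molecular weights listed for isoforms a) to i) can be cross-checked against the common rule of thumb that an average amino acid residue contributes roughly 110 Da. The sketch below is illustrative only: the 100 to 120 Da window and the names `ISOFORMS`, `da_per_residue` and `plausible` are assumptions introduced here, not part of the original description.

```python
# Isoform figures as listed in the text: (label, residues, mass in kDa).
ISOFORMS = [
    ("a", 190, 21.5),
    ("c", 368, 41.5),
    ("d", 190, 21.5),
    ("g", 226, 24.6),
    ("h", 320, 36.5),
    ("i", 217, 24.4),
]

def da_per_residue(residues: int, mass_kda: float) -> float:
    """Average mass per residue in daltons."""
    return mass_kda * 1000.0 / residues

def plausible(residues: int, mass_kda: float,
              low: float = 100.0, high: float = 120.0) -> bool:
    """True if the average residue mass lies in the assumed 100-120 Da window."""
    return low <= da_per_residue(residues, mass_kda) <= high

for label, residues, mass_kda in ISOFORMS:
    avg = da_per_residue(residues, mass_kda)
    print(f"isoform {label}: {avg:.1f} Da/residue, plausible={plausible(residues, mass_kda)}")
```

A value far outside the window would suggest an extraction error in either the residue count or the mass; it would not by itself say which figure is wrong.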

[0082] The MGC9753 gene may be a homologue of the CAB2 gene located on chromosome 17q12. CAB2, a
human homologue of the yeast COS16 required for the repair of DNA double-strand breaks, was cloned.
Autofluorescence analysis of cells transfected with its GFP fusion protein demonstrated that CAB2 translocates into vesicles,
suggesting that overexpression of CAB2 may decrease intercellular Mn(2+) by accumulating it in the vesicles, in the
same way as in yeast.

Her-2/neu, ERBB2, NGL, TKR1

[0083] The oncogene originally called NEU was derived from rat neuro/glioblastoma cell lines. It encodes a tumor
antigen, p185, which is serologically related to EGFR, the epidermal growth factor receptor. EGFR maps to
chromosome 7. In 1985 it was found that the human homologue, designated NGL (to avoid confusion with
neuraminidase, which is also symbolized NEU), maps to 17q12-q22 by in situ hybridization and to 17q21-qter in somatic
cell hybrids [Yang-Feng et al., 1985, (19)]. Thus, the SRO is 17q21-q22. Moreover, in 1985 a potential cell surface
receptor of the tyrosine kinase gene family was identified and characterized by cloning the gene [Coussens et al.,
1985, (20)]. Its primary sequence is very similar to that of the human epidermal growth factor receptor. Because of the
seemingly close relationship to the human EGF receptor, the authors called the gene HER2. By Southern blot analysis
of somatic cell hybrid DNA and by in situ hybridization, the gene was assigned to 17q21-q22. This chromosomal location
of the gene is coincident with the NEU oncogene, which suggests that the 2 genes may in fact be the same; indeed,
sequencing indicates that they are identical. In 1988 a correlation between overexpression of NEU protein and the
large-cell, comedo growth type of ductal carcinoma was found [van de Vijver et al., 1988, (21)]. The authors found no
correlation, however, with lymph-node status or tumor recurrence. The role of HER2/NEU in breast and ovarian cancer,
which together account for one-third of all cancers in women and approximately one-quarter
of cancer-related deaths in females, was described in 1989 [Slamon et al., 1989, (22)].

[0084] An ERBB-related gene, ERBB2, that is distinct from the ERBB gene (called ERBB1) was found in 1985. ERBB2 was
not amplified in vulva carcinoma cells with EGFR amplification and did not react with EGF receptor mRNA. About
30-fold amplification of ERBB2 was observed in a human adenocarcinoma of the salivary gland. By chromosome
sorting combined with velocity sedimentation and Southern hybridization, the ERBB2 gene was assigned to
chromosome 17 [Fukushige et al., 1986, (23)]. By hybridization to sorted chromosomes and to metaphase spreads with a
genomic probe, they mapped the ERBB2 locus to 17q21. This is the chromosome 17 breakpoint in acute promyelocytic
leukemia (APL). Furthermore, they observed amplification and elevated expression of the ERBB2 gene in a gastric
cancer cell line. Antibodies against a synthetic peptide corresponding to 14 amino acid residues at the COOH-terminus
of a protein deduced from the ERBB2 nucleotide sequence were raised in 1986. With these antibodies, the ERBB2
gene product from adenocarcinoma cells was precipitated and demonstrated to be a 185-kD glycoprotein with tyrosine
kinase activity. Using a cDNA probe for ERBB2, in situ hybridization to APL cells with a 15;17 chromosome translocation
located the gene to the proximal side of the breakpoint [Kaneko et al., 1987, (24)]. The authors suggested that both
the gene and the breakpoint are located in band 17q21.1 and, further, that the ERBB2 gene is involved in the
development of leukemia. In 1987 experiments indicated that NEU and HER2 are both the same as ERBB2 [Di Fiore et al.,
1987, (25)]. The authors demonstrated that overexpression alone can convert the gene for a normal growth factor
receptor, namely, ERBB2, into an oncogene. The ERBB2 gene was mapped to 17q11-q21 by in situ hybridization
[Popescu et al., 1989, (26)]. By in situ hybridization to chromosomes derived from fibroblasts carrying a constitutional
translocation between 15 and 17, they showed that the ERBB2 gene was relocated to the derivative chromosome 15; the
gene can thus be localized to 17q12-q21.32. By family linkage studies using multiple DNA markers in the 17q12-q21
region the ERBB2 gene was placed on the genetic map of the region.

[0085] Interleukin-6 is a cytokine that was initially recognized as a regulator of immune and inflammatory responses,
but also regulates the growth of many tumor cells, including prostate cancer. Overexpression of ERBB2 and ERBB3
has been implicated in the neoplastic transformation of prostate cancer. Treatment of a prostate cancer cell line with
IL6 induced tyrosine phosphorylation of ERBB2 and ERBB3, but not ERBB1/EGFR. ERBB2 forms a complex with
the gp130 subunit of the IL6 receptor in an IL6-dependent manner. This association was important because the
inhibition of ERBB2 activity resulted in abrogation of IL6-induced MAPK activation. Thus, ERBB2 is a critical component
of IL6 signaling through the MAP kinase pathway [Qiu et al., 1998, (27)]. These findings showed how a cytokine receptor
can diversify its signaling pathways by engaging with a growth factor receptor kinase.

[0086] Overexpression of ERBB2 confers Taxol resistance in breast cancers. Overexpression of ERBB2 inhibits
Taxol-induced apoptosis [Yu et al., 1998, (28)]. Taxol activates CDC2 kinase in MDA-MB-435 breast cancer cells,
leading to cell cycle arrest at the G2/M phase and, subsequently, apoptosis. A chemical inhibitor of CDC2 and a
dominant-negative mutant of CDC2 blocked Taxol-induced apoptosis in these cells. Overexpression of ERBB2 in
MDA-MB-435 cells by transfection transcriptionally upregulates CDKN1A, which associates with CDC2, inhibits Taxol-mediated
CDC2 activation, delays cell entrance to G2/M phase, and thereby inhibits Taxol-induced apoptosis. In CDKN1A
antisense-transfected MDA-MB-435 cells or in p21-/- MEF cells, ERBB2 was unable to inhibit Taxol-induced apoptosis.
Therefore, CDKN1A participates in the regulation of a G2/M checkpoint that contributes to resistance to Taxol-induced
apoptosis in ERBB2-overexpressing breast cancer cells.

[0087] A secreted protein of approximately 68 kD, designated herstatin, was described as the product of an alternative
ERBB2 transcript that retains intron 8 [Doherty et al., 1999, (29)]. This alternative transcript specifies 340 residues
identical to subdomains I and II from the extracellular domain of p185ERBB2, followed by a unique C-terminal sequence
of 79 amino acids encoded by intron 8. The recombinant product of the alternative transcript specifically bound to
ERBB2-transfected cells and was chemically crosslinked to p185ERBB2, whereas the intron-encoded sequence alone
also bound with high affinity to transfected cells and associated with p185 solubilized from cell extracts. The herstatin
mRNA was expressed in normal human fetal kidney and liver, but was at reduced levels relative to p185ERBB2 mRNA
in carcinoma cells that contained an amplified ERBB2 gene. Herstatin appears to be an inhibitor of p185ERBB2,
because it disrupts dimers, reduces tyrosine phosphorylation of p185, and inhibits the anchorage-independent growth of
transformed cells that overexpress ERBB2. The HER2 gene is amplified and HER2 is overexpressed in 25 to 30% of
breast cancers, increasing the aggressiveness of the tumor. Finally, it was found that a recombinant monoclonal
antibody against HER2 increased the clinical benefit of first-line chemotherapy in metastatic breast cancer that
overexpresses HER2 [Slamon et al., 2001, (30)].

GRB7 

[0088] Growth factor receptor tyrosine kinases (GF-RTKs) are involved in activating the cell cycle. Several substrates
of GF-RTKs contain Src-homology 2 (SH2) and SH3 domains. SH2 domain-containing proteins are a diverse group
of molecules important in tyrosine kinase signaling. Using the CORT (cloning of receptor targets) method to screen a
high expression mouse library, the gene for murine Grb7, which encodes a protein of 535 amino acids, was isolated
[Margolis et al., 1992, (31)]. GRB7 is homologous to ras-GAP (ras-GTPase-activating protein). It contains an SH2
domain and is highly expressed in liver and kidney. This gene defines the GRB7 family, whose members include the
mouse gene Grb10 and the human gene GRB14.

[0089] A putative GRB7 signal transduction molecule and a novel splice variant, GRB7V, were isolated from an invasive
human esophageal carcinoma [Tanaka et al., 1998, (32)]. Although both GRB7 isoforms shared homology with
the Mig-10 cell migration gene of Caenorhabditis elegans, the GRB7V isoform lacked 88 basepairs in the C terminus;
the resultant frameshift led to substitution of an SH2 domain with a short hydrophobic sequence. The wildtype GRB7
protein, but not the GRB7V isoform, was rapidly tyrosyl phosphorylated in response to EGF stimulation in esophageal
carcinoma cells. Analysis of human esophageal tumor tissues and regional lymph nodes with metastases revealed
that GRB7V was expressed in 40% of GRB7-positive esophageal carcinomas. GRB7V expression was enhanced after
metastatic spread to lymph nodes as compared to the original tumor tissues. Transfection of an antisense GRB7 RNA
expression construct lowered endogenous GRB7 protein levels and suppressed the invasive phenotype exhibited by
esophageal carcinoma cells. These findings suggested that GRB7 isoforms are involved in cell invasion and metastatic
progression of human esophageal carcinomas. By sequence analysis, the GRB7 gene was mapped to chromosome
17q21-q22, near the topoisomerase-2 gene [Dong et al., 1997, (33)]. GRB-7 is amplified in concert with HER2 in several
breast cancer cell lines, and GRB-7 is overexpressed in both cell lines and breast tumors. GRB-7, through its SH2
domain, binds tightly to HER2 such that a large fraction of the tyrosine phosphorylated HER2 in SKBR-3 cells is bound
to GRB-7 [Stein et al., 1994, (34)].

GCSF, CSF3 

[0090] Granulocyte colony-stimulating factor (or colony stimulating factor-3) specifically stimulates the proliferation
and differentiation of the progenitor cells for granulocytes. The partial amino acid sequence of purified GCSF protein
was determined, and by using oligonucleotides as probes, several GCSF cDNA clones were isolated from a human
squamous carcinoma cell line cDNA library [Nagata et al., 1986, (35)]. Cloning of human GCSF cDNA shows that a
single gene codes for a 177- or 180-amino acid mature protein of molecular weight 19,600. The authors found that the
GCSF gene has 4 introns and that 2 different polypeptides are synthesized from the same gene by differential splicing
of mRNA. The 2 polypeptides differ by the presence or absence of 3 amino acids. Expression studies indicate that
both have authentic GCSF activity. A stimulatory activity from a glioblastoma multiforme cell line that was biologically
and biochemically indistinguishable from GCSF produced by a bladder cell line was found in 1987. By somatic cell
hybridization and in situ chromosomal hybridization, the GCSF gene was mapped to 17q11 in the region of the breakpoint
in the 15;17 translocation characteristic of acute promyelocytic leukemia [Le Beau et al., 1987, (36)]. Further studies
indicated that the gene is proximal to the said breakpoint and that it remains on the rearranged chromosome 17.
Southern blot analysis using both conventional and pulsed field gel electrophoresis showed no rearranged restriction
fragments. By use of a full-length cDNA clone as a hybridization probe in human-mouse somatic cell hybrids and in
flow-sorted human chromosomes, the gene for GCSF was later mapped to 17q21-q22.

THRA, THRA1, ERBA, EAR7, ERBA2, ERBA3 

[0091] It has been demonstrated that both human and mouse DNA have two distantly related classes of ERBA genes
and that in the human genome multiple copies of one of the classes exist [Jansson et al., 1983, (37)]. A cDNA derived
from rat brain messenger RNA was isolated on the basis of homology to the human thyroid receptor gene [Thompson
et al., 1987, (38)]. Expression of this cDNA produced a high-affinity binding protein for thyroid hormones. Messenger
RNA from this gene was expressed in tissue-specific fashion, with highest levels in the central nervous system and no
expression in the liver. An increasing body of evidence indicated the presence of multiple thyroid hormone receptors.
The authors suggested that there may be as many as 5 different but related loci. Many of the clinical and physiologic
studies suggested the existence of multiple receptors. For example, patients had been identified with familial thyroid
hormone resistance in which peripheral response to thyroid hormones is lost or diminished while neuronal functions
are maintained. Thyroidologists recognize a form of cretinism in which the nervous system is severely affected and
another form in which the peripheral functions of thyroid hormone are more dramatically affected.

[0092] The cDNA encoding a specific form of thyroid hormone receptor expressed in human liver, kidney, placenta,
and brain was isolated [Nakai et al., 1988, (39)]. Identical clones were found in human placenta. The cDNA encodes
a protein of 490 amino acids and molecular mass of 54,824. Designated thyroid hormone receptor type alpha-2
(THRA2), this protein is represented by mRNAs of different size in liver and kidney, which may represent tissue-specific
processing of the primary transcript.

[0093] The THRA gene contains 10 exons spanning 27 kb of DNA. The last 2 exons of the gene are alternatively
spliced. A 5-kb THRA1 mRNA encodes a predicted 410-amino acid protein; a 2.7-kb THRA2 mRNA encodes a
490-amino acid protein. A third isoform, TR-alpha-3, is derived by alternative splicing. The proximal 39 amino acids of the
TR-alpha-2 specific sequences are deleted in TR-alpha-3. A second gene, THRB on chromosome 3, encodes 2 isoforms
of TR-beta by alternative splicing. In 1989 the structure and function of the EAR1 and EAR7 genes were elucidated,
both located on 17q21 [Miyajima et al., 1989, (40)]. The authors determined that one of the exons in the EAR7 coding
sequence overlaps an exon of EAR1, and that the 2 genes are transcribed from opposite DNA strands. In addition,
the EAR7 mRNA generates 2 alternatively spliced isoforms, referred to as EAR71 and EAR72, of which the EAR71
protein is the human counterpart of the chicken c-erbA protein.

[0094] The 3 thyroid hormone receptor mRNAs, beta, alpha-1, and alpha-2, are expressed in all tissues examined,
and the relative amounts of the three mRNAs were roughly parallel. None of the 3 mRNAs was abundant in liver, which
is the major thyroid hormone-responsive organ. This led to the assumption that another thyroid hormone receptor may
be present in liver. It was found that ERBA, which potentiates ERBB, has an amino acid sequence different from that
of other known oncogene products and related to those of the carbonic anhydrases [Debuire et al., 1984, (41)]. ERBA
potentiates ERBB by blocking differentiation of erythroblasts at an immature stage. Carbonic anhydrases participate
in the transport of carbon dioxide in erythrocytes. In 1986 it was shown that the ERBA protein is a high-affinity receptor
for thyroid hormone. The cDNA sequence indicates a relationship to steroid-hormone receptors, and binding studies
indicate that it is a receptor for thyroid hormones. It is located in the nucleus, where it binds to DNA and activates
transcription.

[0095] Maternal thyroid hormone is transferred to the fetus early in pregnancy and is postulated to regulate brain
development. The ontogeny of TR isoforms and related splice variants in 9 first-trimester fetal brains has been
investigated by semi-quantitative RT-PCR analysis. Expression of the TR-beta-1, TR-alpha-1, and TR-alpha-2 isoforms
was detected from 8.1 weeks' gestation. An additional truncated species was detected with the TR-alpha-2 primer set,
consistent with the TR-alpha-3 splice variant described in the rat. All TR-alpha-derived transcripts were coordinately
expressed and increased approximately 8-fold between 8.1 and 13.9 weeks' gestation. A more complex ontogenic
pattern was observed for TR-beta-1, suggestive of a nadir between 8.4 and 12.0 weeks' gestation. The authors
concluded that these findings point to an important role for the TR-alpha-1 isoform in mediating maternal thyroid hormone
action during first-trimester fetal brain development.

[0096] The identification of the several types of thyroid hormone receptor may explain the normal variation in thyroid
hormone responsiveness of various organs and the selective tissue abnormalities found in the thyroid hormone
resistance syndromes. Members of a sibship who were resistant to thyroid hormone action had retarded growth, congenital
deafness, and abnormal bones, but had normal intellect and sexual maturation, as well as augmented cardiovascular
activity. In this family abnormal T3 nuclear receptors in blood cells and fibroblasts have been demonstrated. The
availability of cDNAs encoding the various thyroid hormone receptors was considered useful in determining the underlying
genetic defect in this family.

[0097] The ERBA oncogene has been assigned to chromosome 17. The ERBA locus remains on chromosome 17
in the t(15;17) translocation of acute promyelocytic leukemia (APL). The thymidine kinase locus is probably translocated
to chromosome 15; study of leukemia with t(17;21) and apparently identical breakpoint showed that TK was on 21q+.
By in situ hybridization of a cloned DNA probe of c-erb-A to meiotic pachytene spreads obtained from uncultured
spermatocytes it has been concluded that ERBA is situated at 17q21.33-17q22, in the same region as the break that
generated the t(15;17) seen in APL. Because most of the grains were seen in 17q22, it was suggested that ERBA is
probably in the proximal region of 17q22 or at the junction between 17q22 and 17q21.33. By in situ hybridization it has
been demonstrated that ERBA remains at 17q11-q12 in APL, whereas TP53, at 17q21-q22, is translocated to
chromosome 15. Thus, ERBA must be at 17q11.2 just proximal to the breakpoint in the APL translocation and just
distal to it in the constitutional translocation.

[0098] The aberrant THRA expression in nonfunctioning pituitary tumors has been hypothesized to reflect mutations 
in the receptor coding and regulatory sequences. THRA mRNA and THRB response elements and
ligand-binding domains were screened for sequence anomalies. Screening THRA mRNA from 23 tumors by RNAse mismatch and
sequencing candidate fragments identified 1 silent and 3 missense mutations, 2 in the common THRA region and 1
that was specific for the alpha-2 isoform. No THRB response element differences were detected in 14 nonfunctioning
tumors, and no THRB ligand-binding domain differences were detected in 23 nonfunctioning tumors. Therefore it has 
been suggested that the novel thyroid receptor mutations may be of functional significance in terms of thyroid receptor 
action, and further definition of their functional properties may provide insight into the role of thyroid receptors in growth 
control in pituitary cells. 


RAR-alpha 

[0099] A cDNA encoding a protein that binds retinoic acid with high affinity has been cloned [Petkovich et al., 1987,
(42)]. The protein was found to be homologous to the receptors for steroid hormones, thyroid hormones, and vitamin
D3, and appeared to be a retinoic acid-inducible transacting enhancer factor. Thus, the molecular mechanisms of the
effect of vitamin A on embryonic development, differentiation and tumor cell growth may be similar to those described
for other members of this nuclear receptor family. In general, the DNA-binding domain is most highly conserved, both
within and between the 2 groups of receptors (steroid and thyroid). Using a cDNA probe, the RAR-alpha gene has
been mapped to 17q21 by in situ hybridization [Mattei et al., 1988, (43)]. Evidence has been presented for the existence
of 2 retinoic acid receptors, RAR-alpha and RAR-beta, mapping to chromosome 17q21.1 and 3p24, respectively. The
alpha and beta forms of RAR were found to be more homologous to the 2 closely related thyroid hormone receptors
alpha and beta, located on 17q11.2 and 3p25-p21, respectively, than to any other members of the nuclear receptor
family. These observations suggest that the thyroid hormone and retinoic acid receptors evolved by gene, and possibly
chromosome, duplications from a common ancestor, which itself diverged rather early in evolution from the common
ancestor of the steroid receptor group of the family. It was noted that the counterparts of the human RARA and RARB
genes are present in both the mouse and chicken. The involvement of RARA at the APL breakpoint may explain why
the use of retinoic acid as a therapeutic differentiation agent in the treatment of acute myeloid leukemias is limited to
APL. Almost all patients with APL have a chromosomal translocation t(15;17)(q22;q21). Molecular studies reveal that
the translocation results in a chimeric gene through fusion between the PML gene on chromosome 15 and the RARA
gene on chromosome 17. A hormone-dependent interaction of the nuclear receptors RARA and RXRA with CLOCK
and MOP4 has been reported.

CDC18L, CDC6

[0100] In yeasts, Cdc6 (Saccharomyces cerevisiae) and Cdc18 (Schizosaccharomyces pombe) associate with the
origin recognition complex (ORC) proteins to render cells competent for DNA replication. Thus, Cdc6 has a critical
regulatory role in the initiation of DNA replication in yeast. cDNAs encoding Xenopus and human homologues of yeast
CDC6 have been isolated [Williams et al., 1997, (44)]. The authors designated the human and Xenopus proteins p62(cdc6).
Independently, in a yeast 2-hybrid assay using PCNA as bait, cDNAs encoding the human CDC6/Cdc18 homologue
have been isolated [Saha et al., 1998, (45)]. These authors reported that the predicted 560-amino acid human protein
shares approximately 33% sequence identity with the 2 yeast proteins. On Western blots of HeLa cell extracts, human
CDC6/Cdc18 migrates as a 66-kD protein. Although Northern blots indicated that CDC6/Cdc18 mRNA levels peak at
the onset of S phase and diminish at the onset of mitosis in HeLa cells, the authors found that the total CDC6/Cdc18
protein level is unchanged throughout the cell cycle. Immunofluorescent analysis of epitope-tagged protein revealed that
human CDC6/Cdc18 is nuclear in G1 and cytoplasmic in S-phase cells, suggesting that DNA replication may be regulated
by either the translocation of this protein between the nucleus and cytoplasm or by selective degradation of the protein
in the nucleus. Immunoprecipitation studies showed that human CDC6/Cdc18 associates in vivo with cyclin A,
CDK2, and ORC1. The association of cyclin-CDK2 with CDC6/Cdc18 was specifically inhibited by a factor present in
mitotic cell extracts. Therefore it has been suggested that if the interaction of CDC6/Cdc18 with the S phase-promoting
factor cyclin-CDK2 is essential for the initiation of DNA replication, the mitotic inhibitor of this interaction
could prevent a premature interaction until the appropriate time in G1. Cdc6 is expressed selectively in proliferating but
not quiescent mammalian cells, both in culture and within tissues in intact animals [Yan et al., 1998, (46)]. During the
transition from a growth-arrested to a proliferative state, transcription of mammalian Cdc6 is regulated by E2F proteins,
as revealed by a functional analysis of the human Cdc6 promoter and by the ability of exogenously expressed E2F
proteins to stimulate the endogenous Cdc6 gene. Immunodepletion of Cdc6 by microinjection of anti-Cdc6 antibody
blocked initiation of DNA replication in a human tumor cell line. The authors concluded that expression of human Cdc6
is regulated in response to mitogenic signals through transcriptional control mechanisms involving E2F proteins, and
that Cdc6 is required for initiation of DNA replication in mammalian cells.

[0101] Using a yeast 2-hybrid system, co-purification of recombinant proteins, and immunoprecipitation, it was later
demonstrated that an N-terminal segment of CDC6 binds specifically to PR48, a regulatory subunit of protein
phosphatase 2A (PP2A). The authors hypothesized that dephosphorylation of CDC6 by PP2A, mediated by a specific
interaction with PR48 or a related B-double prime protein, is a regulatory event controlling initiation of DNA replication
in mammalian cells. By analysis of somatic cell hybrids and by fluorescence in situ hybridization the human p62(cdc6)
gene has been mapped to 17q21.3.

TOP2A, TOP2 

[0102] DNA topoisomerases are enzymes that control and alter the topologic states of DNA in both prokaryotes and
eukaryotes. Topoisomerase II from eukaryotic cells catalyzes the relaxation of supercoiled DNA molecules, catenation,
decatenation, knotting, and unknotting of circular DNA. It appears likely that the reaction catalyzed by topoisomerase
II involves the crossing-over of 2 DNA segments. It has been estimated that there are about 100,000 molecules of
topoisomerase II per HeLa cell nucleus, constituting about 0.1% of the nuclear extract. Since several of the abnormal
characteristics of ataxia-telangiectasia appear to be due to defects in DNA processing, screening for these enzyme
activities in 5 AT cell lines has been performed [Singh et al., 1988, (47)]. In comparison to controls, the level of DNA
topoisomerase II, determined by unknotting of P4 phage DNA, was reduced substantially in 4 of these cell lines and
to a lesser extent in the fifth. DNA topoisomerase I, assayed by relaxation of supercoiled DNA, was found to be present
at normal levels.

[0103] The entire coding sequence of the human TOP2 gene has been determined [Tsai-Pflugfelder et al., 1988, (48)].
[0104] In addition, human cDNAs that had been isolated by screening a cDNA library derived from a mechlorethamine-resistant
Burkitt lymphoma cell line (Raji-HN2) with a Drosophila Topo II cDNA have been sequenced [Chung et al.,
1989, (49)]. The authors identified 2 classes of sequence representing 2 TOP2 isoenzymes, which have been named
TOP2A and TOP2B. The sequence of 1 of the TOP2A cDNAs is identical to that of an internal fragment of the TOP2
cDNA isolated by Tsai-Pflugfelder et al., 1988 (48). Southern blot analysis indicated that the TOP2A and TOP2B cDNAs
are derived from distinct genes. Northern blot analysis using a TOP2A-specific probe detected a 6.5-kb transcript in
the human cell line U937. Antibodies against a TOP2A peptide recognized a 170-kD protein in U937 cell lysates.
Therefore it was concluded that these data provide genetic and immunochemical evidence for 2 TOP2 isozymes. The
complete structures of the TOP2A and TOP2B genes have been reported [Lang et al., 1998, (50)]. The TOP2A gene
spans approximately 30 kb and contains 35 exons.

[0105] Tsai-Pflugfelder et al., 1988 (48) showed that the human enzyme is encoded by a single-copy gene which
they mapped to 17q21-q22 by a combination of in situ hybridization of a cloned fragment to metaphase chromosomes
and by Southern hybridization analysis with a panel of mouse-human hybrid cell lines. The assignment to chromosome
17 has been confirmed by the study of somatic cell hybrids. Because of co-amplification in an adenocarcinoma cell
line, it was concluded that the TOP2A and ERBB2 genes may be closely linked on chromosome 17 [Keith et al., 1992,
(51)]. Using probes that detected RFLPs at both the TOP2A and TOP2B loci, heterozygosity was demonstrated at a
frequency of 0.17 and 0.37 for the alpha and beta loci, respectively. The mouse homologue was mapped to chromosome
11 [Kingsmore et al., 1993, (52)]. The structure and function of type II DNA topoisomerases have been reviewed [Watt
et al., 1994, (53)]. DNA topoisomerase II-alpha is associated with the pol II holoenzyme and is a required component
of chromatin-dependent co-activation. Specific inhibitors of topoisomerase II blocked transcription on chromatin
templates, but did not affect transcription on naked templates. Addition of purified topoisomerase II-alpha reconstituted
chromatin-dependent activation activity in reactions with core pol II. Therefore the transcription on chromatin templates
seems to result in the accumulation of superhelical tension, making the relaxation activity of topoisomerase II essential
for productive RNA synthesis on nucleosomal DNA.

IGFBP4 

[0106] Six structurally distinct insulin-like growth factor binding proteins have been isolated and their cDNAs cloned:
IGFBP1, IGFBP2, IGFBP3, IGFBP4, IGFBP5 and IGFBP6. The proteins display strong sequence homologies,
suggesting that they are encoded by a closely related family of genes. The IGFBPs contain 3 structurally distinct domains
each comprising approximately one-third of the molecule. The N-terminal domain 1 and the C-terminal domain 3 of
the 6 human IGFBPs show moderate to high levels of sequence identity including 12 and 6 invariant cysteine residues
in domains 1 and 3, respectively (IGFBP6 contains 10 cysteine residues in domain 1), and are thought to be the IGF
binding domains. Domain 2 is defined primarily by a lack of sequence identity among the 6 IGFBPs and by a lack of
cysteine residues, though it does contain 2 cysteines in IGFBP4. Domain 3 is homologous to the thyroglobulin type I
repeat unit. Recombinant human insulin-like growth factor binding proteins 4, 5, and 6 have been characterized by
their expression in yeast as fusion proteins with ubiquitin [Kiefer et al., 1992, (54)]. Results of the study suggested to
the authors that the primary effect of the 3 proteins is the attenuation of IGF activity and suggested that they contribute
to the control of IGF-mediated cell growth and metabolism.

[0107] Based on peptide sequences of a purified insulin-like growth factor-binding protein (IGFBP), rat IGFBP4 has
been cloned by using PCR [Shimasaki et al., 1990, (55)]. The rat cDNA was then used to clone the human ortholog from
a liver cDNA library. Human IGFBP4 encodes a 258-amino acid polypeptide, which includes a 21-amino acid signal
sequence. The protein is very hydrophilic, which may facilitate its ability as a carrier protein for the IGFs in blood.
Northern blot analysis of rat tissues revealed expression in all tissues examined, with highest expression in liver. It
was stated that IGFBP4 acts as an inhibitor of IGF-induced bone cell proliferation. The genomic region containing the
IGFBP4 gene has been examined [Zazzi et al., 1998, (56)]. The gene consists of 4 exons spanning approximately
15 kb of genomic DNA. The upstream region of the gene contains a TATA box and a cAMP-responsive promoter.

[0108] By in situ hybridization, the IGFBP4 gene was mapped to 17q12-q21 [Bajalica et al., 1992, (57)]. Because
the hereditary breast-ovarian cancer gene BRCA1 had been mapped to the same region, it has been investigated
whether IGFBP4 is a candidate gene by linkage analysis of 22 BRCA1 families; the finding of genetic recombination
suggested that it is not the BRCA1 gene [Tonin et al., 1993, (58)].

EBI1, CCR7, CMKBR7

[0109] Using PCR with degenerate oligonucleotides, a lymphoid-specific member of the G protein-coupled receptor
family has been identified and mapped to 17q12-q21.2 by analysis of human/mouse somatic cell hybrid DNAs
and fluorescence in situ hybridization. It has been shown that this receptor had been independently identified as the
Epstein-Barr-induced cDNA (symbol EBI1) [Birkenbach et al., 1993, (59)]. EBI1 is expressed in normal lymphoid tissues
and in several B- and T-lymphocyte cell lines. While the function and the ligand for EBI1 were initially unknown, its
sequence and gene structure suggest that it is related to receptors that recognize chemoattractants, such as interleukin-8,
RANTES, C5a, and fMet-Leu-Phe. Like the chemoattractant receptors, EBI1 contains intervening sequences near its
5-prime end; however, EBI1 is unique in that both of its introns interrupt the coding region of the first extracellular
domain. Mouse Ebi1 cDNA has been isolated and found to encode a protein with 86% identity to the human homologue.
[0110] Subsets of murine CD4+ T cells localize to different areas of the spleen after adoptive transfer. Naive and T
helper-1 (TH1) cells, which express CCR7, home to the periarteriolar lymphoid sheath, whereas activated TH2 cells,
which lack CCR7, form rings at the periphery of the T-cell zones near B-cell follicles. It has been found that retroviral
transduction of TH2 cells with CCR7 forced them to localize in a TH1-like pattern and inhibited their participation in
B-cell help in vivo but not in vitro. Apparently differential expression of chemokine receptors results in unique cellular
migration patterns that are important for effective immune responses.

[0111] CCR7 expression divides human memory T cells into 2 functionally distinct subsets. CCR7- memory cells
express receptors for migration to inflamed tissues and display immediate effector function. In contrast, CCR7+ memory
cells express lymph node homing receptors and lack immediate effector function, but efficiently stimulate dendritic cells
and differentiate into CCR7- effector cells upon secondary stimulation. The CCR7+ and CCR7- T cells, named central
memory (T-CM) and effector memory (T-EM), differentiate in a step-wise fashion from naive T cells, persist for years
after immunization, and allow a division of labor in the memory response.

[0112] CCR7 expression in memory CD8+ T lymphocyte responses to HIV and to cytomegalovirus (CMV) tetramers
has been evaluated. Most memory T lymphocytes express CD45RO, but a fraction express instead the CD45RA
marker. Flow cytometric analyses of marker expression and cell division identified 4 subsets of HIV- and CMV-specific CD8+
T cells, representing a lineage differentiation pattern: CD45RA+CCR7+ (double-positive); CD45RA-CCR7+;
CD45RA-CCR7- (double-negative); CD45RA+CCR7-. The capacity for cell division, as measured by 5-(and 6-)carboxyfluorescein
diacetate, succinimidyl ester, and intracellular staining for the Ki67 nuclear antigen, is largely confined to
the CCR7+ subsets and occurred more rapidly in cells that are also CD45RA+. Although the double-negative cells did
not divide or expand after stimulation, they did revert to positivity for either CD45RA or CCR7 or both. The
CD45RA+CCR7- cells, considered to be terminally differentiated, fail to divide, but do produce interferon-gamma and
express high levels of perforin. The representation of subsets specific for CMV and for HIV is distinct. Approximately
70% of HIV-specific CD8+ memory T cells are double-negative or preterminally differentiated compared to 40% of
CMV-specific cells. Approximately 50% of the CMV-specific CD8+ memory T cells are terminally differentiated
compared to fewer than 10% of the HIV-specific cells. It has been proposed that terminally differentiated CMV-specific cells
are poised to rapidly intervene, while double-positive precursor cells remain for expansion and replenishment of the
effector cell pool. Furthermore, high-dose antigen tolerance and the depletion of HIV-specific CD4+ helper T-cell activity
may keep the HIV-specific memory CD8+ T cells at the double-negative stage, unable to differentiate to the terminal
effector state. B lymphocytes recirculate between B cell-rich compartments (follicles or B zones) in secondary lymphoid
organs, surveying for antigen. After antigen binding, B cells move to the boundary of B and T zones to interact with
T-helper cells. Furthermore it has been demonstrated that antigen-engaged B cells have increased expression of CCR7,
the receptor for the T-zone chemokines CCL19 (also known as ELC) and CCL21, and that they exhibit increased
responsiveness to both chemoattractants. In mice lacking lymphoid CCL19 and CCL21 chemokines, or with B cells
that lack CCR7, antigen engagement fails to cause movement to the T zone. Using retroviral-mediated gene transfer,
the authors demonstrated that increased expression of CCR7 is sufficient to direct B cells to the T zone. Reciprocally,
overexpression of CXCR5, the receptor for the B-zone chemokine CXCL13, is sufficient to overcome antigen-induced
B-cell movement to the T zone. This points toward a mechanism of B-cell relocalization in response to antigen, and
establishes that cell position in vivo can be determined by the balance of responsiveness to chemoattractants made
in separate but adjacent zones.

BAF57, SMARCE1

[0113] The SWI/SNF complex in S. cerevisiae and Drosophila is thought to facilitate transcriptional activation of
specific genes by antagonizing chromatin-mediated transcriptional repression. The complex contains an ATP-dependent
nucleosome disruption activity that can lead to enhanced binding of transcription factors. The BRG1/brm-associated
factors, or BAF, complex in mammals is functionally related to SWI/SNF and consists of 9 to 12 subunits, some of
which are homologous to SWI/SNF subunits. A 57-kD BAF subunit, BAF57, is present in higher eukaryotes, but not in
yeast. Partial coding sequence has been obtained from purified BAF57 from extracts of a human cell line [Wang et al.,
1998, (60)]. Based on the peptide sequences, the authors identified cDNAs encoding BAF57. The predicted 411-amino acid
protein contains an HMG domain adjacent to a kinesin-like region. Both recombinant BAF57 and the whole BAF
complex bind 4-way junction (4WJ) DNA, which is thought to mimic the topology of DNA as it enters or exits the nucleosome.
The BAF57 DNA-binding activity has characteristics similar to those of other HMG proteins. It was found that complexes
with mutations in the BAF57 HMG domain retain their DNA-binding and nucleosome-disruption activities. It was
suggested that the mechanism by which mammalian SWI/SNF-like complexes interact with chromatin may involve
recognition of higher-order chromatin structure by 2 or more DNA-binding domains. RNase protection studies and Western
blot analysis revealed that BAF57 is expressed ubiquitously. Several lines of evidence point toward the involvement
of SWI/SNF factors in cancer development [Klochendler-Yeivin et al., 2002, (61)]. Moreover, SWI/SNF related genes
are assigned to chromosomal regions that are frequently involved in somatic rearrangements in human cancers [Ring
et al., 1998, (62)]. In this respect it is interesting that some of the SWI/SNF family members (i.e. SMARCC1, SMARCC2,
SMARCD1 and SMARCD2) are neighboring 3 of the eukaryotic ARCHEONs we have identified (i.e. 3p21-p24,
12q13-q14 and 17q, respectively) and which are part of the present invention. In this invention we could also map
SMARCE1/BAF57 to the 17q12 region by PCR karyotyping.

KRT10, K10

[0114] Keratin 10 is an intermediate filament (IF) chain which belongs to the acidic type I family and is expressed in
terminally differentiated epidermal cells. Epithelial cells almost always co-express pairs of type I and type II keratins,
and the pairs that are co-expressed are highly characteristic of a given epithelial tissue. For example, in human
epidermis, 3 different pairs of keratins are expressed: keratins 5 (type II) and 14 (type I), characteristic of basal or
proliferative cells; keratins 1 (type II) and 10 (type I), characteristic of suprabasal terminally differentiating cells; and keratins
6 (type II) and 16 (type I) (and keratin 17 [type I]), characteristic of cells induced to hyper-proliferate by disease or
injury, and epithelial cells grown in cell culture. The nucleotide sequence of a 1,700 bp cDNA encoding human epidermal
keratin 10 (56.5 kD) [Darmon et al., 1987, (63)] has been published, as well as the complete amino acid sequence of
human keratin 10 [Zhou et al., 1988, (64)]. Polymorphisms of the KRT10 gene, restricted to insertions and deletions of
the glycine-rich quasipeptide repeats that form the glycine-loop motif in the C-terminal domain, have been extensively
described [Korge et al., 1992, (65)].

[0115] By use of specific cDNA clones in conjunction with somatic cell hybrid analysis and in situ hybridization, the KRT10
gene has been mapped to 17q12-q21 in a region proximal to the breakpoint at 17q21 that is involved in a t(17;21)(q21;q22)
translocation associated with a form of acute leukemia. KRT10 appeared to be telomeric to 3 other loci that map
in the same region: CSF3, ERBA1, and HER2 [Lessin et al., 1988, (66)]. NGFR and HOX2 are distal to K9. It has been
demonstrated that the KRT10, KRT13, and KRT15 genes are located in the same large pulsed field gel electrophoresis
fragment [Romano et al., 1991, (67)]. A correlation of assignments of the 3 genes makes 17q21-q22 the likely location
of the cluster. Transgenic mice expressing a mutant keratin 10 gene have the phenotype of epidermolytic
hyperkeratosis, thus suggesting that a genetic basis for the human disorder resides in mutations in genes encoding
suprabasal keratins KRT1 or KRT10 [Fuchs et al., 1992, (68)]. The authors also showed that stimulation of basal cell
proliferation can result from a defect in suprabasal cells and that distortion of nuclear shape or alterations in cytokinesis
can occur when an intermediate filament network is perturbed. In a family with keratosis palmaris et plantaris without
blistering either spontaneously or in response to mild mechanical or thermal stress and with no involvement of the skin
and parts of the body other than the palms and soles, a tight linkage to an insertion-deletion polymorphism in the
C-terminal coding region of the KRT10 gene (maximum lod score = 8.36 at theta = 0.00) was found [Rogaev et al., 1993,
(69)]. It is noteworthy that it was a rare, high molecular weight allele of the KRT10 polymorphism that segregated with
the disorder. The allele was observed once in 96 independent chromosomes from unaffected Caucasians. The KRT10
polymorphism arose from the insertion/deletion of imperfect (CCG)n repeats within the coding region and gave rise to
a variable glycine loop motif in the C-terminal tail of the keratin 10 protein. It is possible that there was a pathogenic
role for the expansion of the imperfect trinucleotide repeat.
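
For orientation, the maximum lod score quoted in the preceding paragraph can be read against the standard textbook definition of the statistic. The block below is only a sketch of that definition, not a calculation taken from the cited study: the symbols L, n and k are introduced here for illustration, and the closed form assumes phase-known, fully informative meioses.

```latex
% Standard lod score definition; L(\theta) is the pedigree likelihood at
% recombination fraction \theta (symbols introduced for illustration only).
\[
  Z(\theta) = \log_{10}\frac{L(\theta)}{L(\theta = 1/2)}
\]
% Simplifying assumption: n phase-known, fully informative meioses,
% k of which are recombinant.
\[
  Z(\theta) = \log_{10}\frac{\theta^{k}\,(1-\theta)^{n-k}}{(1/2)^{n}},
  \qquad
  Z(0)\big|_{k=0} = n\log_{10}2 \approx 0.301\,n
\]
% A maximum lod score of 8.36 at \theta = 0.00 corresponds to odds of
% 10^{8.36} \approx 2.3 \times 10^{8} : 1 in favour of linkage.
```

A lod score above 3 is conventionally taken as significant evidence of linkage, so a value of 8.36 with no observed recombination is consistent with the tight linkage described for the KRT10 polymorphism.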

KRT12, K12

[0116] Keratins are a group of water-insoluble proteins that form 10 nm intermediate filaments in epithelial cells.
Approximately 30 different keratin molecules have been identified. They can be divided into acidic and basic-neutral
subfamilies according to their relative charges, immunoreactivity, and sequence homologies to types I and II wool
keratins, respectively. In vivo, a basic keratin usually is co-expressed and 'paired' with a particular acidic keratin to
form a heterodimer. The expression of various keratin pairs is tissue specific, differentiation dependent, and
developmentally regulated. The presence of specific keratin pairs is essential for the maintenance of the integrity of epithelium.
For example, mutations in the human K14/K5 pair and the K10/K1 pair underlie the skin diseases epidermolysis bullosa
simplex and epidermolytic hyperkeratosis, respectively. Expression of the K3 and K12 keratin pair has been found in
the cornea of a wide number of species, including human, mouse, and chicken, and is regarded as a marker for
corneal-type epithelial differentiation. The murine Krt12 (Krt1.12) gene has been cloned, and it has been demonstrated
that its expression is corneal epithelial cell specific, differentiation dependent, and developmentally regulated
[Liu et al., 1993, (70)]. The corneal-specific
nature of keratin 12 gene expression signifies that keratin 12 plays a unique role in maintaining normal corneal epithelial
function. Nevertheless, the exact function of keratin 12 remains unknown and no hereditary human corneal epithelial
disorder has been linked directly to the mutation in the keratin 12 gene. As part of a study of the expression profile of
human corneal epithelial cells, a cDNA with an open reading frame highly homologous to the cornea-specific mouse
keratin 12 gene has been isolated [Nishida et al., 1996, (71)]. To elucidate the function of keratin 12, knockout mice
lacking the Krt1.12 gene have been created by gene targeting techniques. The heterozygous mice appeared normal.
Homozygous mice developed normally and suffered mild corneal epithelial erosion. The corneal epithelia were fragile
and could be removed by gentle rubbing of the eyes or brushing. The corneal epithelium of the homozygotes did not
express keratin 12 as judged by immunohistochemistry, Western immunoblot analysis with epitope-specific anti-keratin
12 antibodies, Northern hybridization, and in situ hybridization with an antisense keratin 12 riboprobe. The KRT12 gene
has been mapped to 17q by study of radiation hybrids and localized to the type I keratin cluster in the interval between
D17S800 and D17S930 (17q12-q21) [Nishida et al., 1997, (72)]. The authors presented the exon-intron boundary
structure of the KRT12 gene and mapped the gene to 17q12 by fluorescence in situ hybridization. The gene contains
7 introns, defining 8 exons that cover the coding sequence. Together the exons and introns span approximately 6 kb
of genomic DNA.

[0117] Meesmann corneal dystrophy is an autosomal dominant disorder causing fragility of the anterior corneal
epithelium, where the cornea-specific keratins K3 and K12 are expressed. Dominant-negative mutations in these keratins
might be the cause of Meesmann corneal dystrophy. Indeed, linkage of the disorder to the K12 locus in Meesmann's
original German kindred [Meesmann and Wilke, 1939, (73)] with Z(max) = 7.53 at theta = 0.0 has been found. In 2
pedigrees from Northern Ireland, it was found that the disorder co-segregated with K12 in one pedigree and K3 in the
other. Heterozygous missense mutations in K3 or in K12 (R135T, V143L) in each family have been identified. All these
mutations occurred in highly conserved keratin helix boundary motifs, where dominant mutations in other keratins have
been found to compromise cytoskeletal function severely, leading to keratinocyte fragility.

[0118] The regions of the human KRT12 gene have been sequenced to enable mutation detection for all exons using
genomic DNA as a template [Corden et al., 2000, (74)]. The authors found that the human genomic sequence spans
5,919 bp and consists of 8 exons. A microsatellite dinucleotide repeat was identified within intron 3, which was highly
polymorphic and which they developed for use in genotype analysis. In addition, 2 mutations in the helix initiation motif
of K12 were found in families with Meesmann corneal dystrophy. In an American kindred, a missense M129T mutation
was found in the KRT12 gene. They stated that a total of 8 mutations in the KRT12 gene had been reported.

Genetic interactions within ARCHEONs

[0119] Genes involved in genomic alterations (amplifications, insertions, translocations, deletions, etc.) exhibit
changes in their expression pattern. Of particular interest are gene amplifications, which account for gene copy numbers
>2 per cell, or deletions, which account for gene copy numbers <2 per cell. Gene copy number and gene expression of the
respective genes do not necessarily correlate. Transcriptional overexpression needs an intact transcriptional context,
as determined by regulatory regions at the chromosomal locus (promoter, enhancer and silencer), and sufficient
amounts of transcriptional regulators being present in effective combinations. This is especially true for genomic
regions whose expression is tightly regulated in specific tissues or during specific developmental stages. ARCHEONs
are specified by gene clusters of more than two genes being directly neighboured or in chromosomal order, interspersed
by a maximum of 10, preferably 7, more preferably 5 or at least 1 gene. The interspersed genes are also co-amplified
but do not directly interact with the ARCHEON. Such an ARCHEON may spread over a chromosomal region of a
maximum of 20, more preferably 10 or at least 6 Megabases. The nature of an ARCHEON is characterized by the
simultaneous amplification and/or deletion and the correlating expression (i.e. upregulation or downregulation
respectively) of the encompassed genes in a specific tissue, cell type, cellular or developmental state or time point. Such
ARCHEONs are commonly conserved during evolution, as they play critical roles during cellular development. In the case
of these ARCHEONs, whole gene clusters are overexpressed upon amplification as they harbor self-regulatory feedback
loops, which stabilize gene expression and/or biological effector function even in abnormal biological settings, or are
regulated by very similar transcription factor combinations, reflecting their simultaneous function in specific tissues at
certain developmental stages. Therefore, the gene copy number correlates with the expression level especially for
genes in gene clusters functioning as ARCHEONs. In the case of abnormal gene expression in neoplastic lesions it is of
great importance to know whether the self-regulatory feedback loops have been conserved, as they determine the
biological activity of the ARCHEON gene members.
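
The structural criteria given in this paragraph (more than two altered genes in chromosomal order, a limited number of interspersed genes, a bounded span in megabases, and copy numbers above or below 2 per cell) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the `Gene` record, the function names and all example values are hypothetical, the limit on interspersed genes is interpreted as applying between neighbouring cluster members, and the broadest limits quoted above (10 genes, 20 Megabases) are used as defaults.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    order: int          # rank of the gene along the chromosome (hypothetical index)
    position_mb: float  # position in megabases (hypothetical value)
    copy_number: float  # measured gene copies per cell


def copy_state(copy_number, normal=2.0):
    """Classify a gene as amplified (>2 copies), deleted (<2 copies) or normal."""
    if copy_number > normal:
        return "amplified"
    if copy_number < normal:
        return "deleted"
    return "normal"


def is_candidate_cluster(genes, max_interspersed=10, max_span_mb=20.0):
    """Check only the structural criteria; expression correlation is not tested."""
    altered = sorted(
        (g for g in genes if copy_state(g.copy_number) != "normal"),
        key=lambda g: g.order,
    )
    if len(altered) <= 2:  # the text requires more than two genes
        return False
    # Assumption: the interspersed-gene limit applies to each gap between members.
    for left, right in zip(altered, altered[1:]):
        if right.order - left.order - 1 > max_interspersed:
            return False
    return altered[-1].position_mb - altered[0].position_mb <= max_span_mb


# Made-up example data, not measurements from the invention.
example = [
    Gene("GENE_A", 1, 35.0, 6),
    Gene("GENE_B", 2, 35.1, 5),
    Gene("GENE_C", 5, 36.0, 4),
]
print(is_candidate_cluster(example))  # True
```

The sketch deliberately leaves out the second half of the definition, the correlating expression of the encompassed genes, which would have to be assessed on expression data.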

[0120] The intensive interaction between genes in ARCHEONs is described for the 17q12 ARCHEON (Fig. 1) by
way of illustration, not by way of limitation. In one embodiment the presence or absence of alterations of genes within
distinct genomic regions are correlated with each other, as exemplified for breast cancer cell lines (Fig. 3 and Fig. 4).
This relates to the discovery of the present invention that multiple interactions of said gene products of defined
chromosomal localizations occur, which according to their respective alterations in abnormal tissue have predictive,
diagnostic, prognostic and/or preventive and therapeutic value. These interactions are mediated directly or indirectly, due
to the fact that the respective genes are part of interconnected or independent signaling networks or regulate cellular
behavior (differentiation status, proliferative and/or apoptotic capacity, invasiveness, drug responsiveness, immune
modulatory activities) in a synergistic, antagonistic or independent fashion. The order of functionally important genes
within the ARCHEONs has been conserved during evolution (e.g. the ARCHEON on human chromosome 17q12 is present on
mouse chromosome 11). Moreover, it has been found that the 17q12 ARCHEON is also present on human chromosome
3p21 and 12q13, both of which are also involved in amplification events and in tumor development. Most probably
these homologous ARCHEONs were formed by duplications and rearrangements during vertebrate evolution.
Homologous ARCHEONs consist of homologous genes and/or isoforms of specific gene families (e.g. RARA or RARB or
RARG, THRA or THRB, TOP2A or TOP2B, RAB5A or RAB5B, BAF170 or BAF155, BAF60A or BAF60B, WNT5A or
WNT5B, IGFBP4 or IGFBP6). Moreover these regions are flanked by homologous chromosomal gene clusters (e.g.
CACN, SCYA, HOX, Keratins). These ARCHEONs have diverged during evolution to fulfill their respective functions
in distinct tissues (e.g. the 17q12 ARCHEON has one of its main functions in the central nervous system). Due to their
tissue specific function extensive regulatory loops control the expression of the members of each ARCHEON. During
tumor development these regulations become critical for the characteristics of the abnormal tissues with respect to
differentiation, proliferation, drug responsiveness, invasiveness. It has been found that the co-amplification of genes
within ARCHEONs can lead to co-expression of the respective gene products. Some of said genes also exhibit
additional mutations or specific patterns of polymorphisms, which are substantial for the oncogenic capacities of these
ARCHEONs. A critical feature of such amplicons is which members of the ARCHEON have been conserved
during tumor formation (e.g. during amplification and deletion events), thereby defining these genes as diagnostic
marker genes. Moreover, the expression of certain genes within the ARCHEON can be influenced by other members
of the ARCHEON, thereby defining the regulatory and regulated genes as target genes for therapeutic intervention. It
was also observed that the expression of certain members of the ARCHEON is sensitive to drug treatment (e.g. TOPO2
alpha, RARA, THRA, HER-2), which defines these genes as "marker genes". Moreover several other genes are suitable
for therapeutic intervention by antibodies (CACNB1, EBI1), ligands (CACNB1) or drugs like e.g. kinase inhibitors
(CrkRS, CDC6). The following examples of interactions between members of ARCHEONs are offered by way of
illustration, not by way of limitation.

[0121] EBI1/CCR7 is a lymphoid-specific member of the G protein-coupled receptor family. EBI1 recognizes
chemoattractants, such as interleukin-8, SCYAs, Rantes, C5a, and fMet-Leu-Phe. The capacity for cell division is largely
confined to the CCR7+ subsets in lymphocytes. Double-negative cells did not divide or expand after stimulation. CCR7-
cells, considered to be terminally differentiated, fail to divide, but do produce interferon-gamma and express high levels
of perforin. EBI1 is induced by viral activities such as those of the Epstein-Barr virus. Therefore, EBI1 is associated with
transformation events in lymphocytes. A functional role of EBI1 during tumor formation in non-lymphoid tissues has
been investigated in this invention. Interestingly, ERBA and ERBB, located in the same genomic region, are also
associated with lymphocyte transformation. Moreover, ligands of the receptor (i.e. SCYA5/Rantes) are in genomic
proximity on 17q. Abnormal expression of both of these factors in lymphoid and non-lymphoid tissues establishes an
autoregulatory feedback loop, inducing signaling events within the respective cells. Expression of lymphoid factors has
an effect on immune cells and modulates cellular behavior. This is of particular interest with regard to abnormal breast
tissue being infiltrated by lymphocytes. In line with this, another immunomodulatory and proliferation factor is located
nearby on 17q12. Granulocyte colony-stimulating factor (GCSF3) specifically stimulates the proliferation and
differentiation of the progenitor cells for granulocytes. A stimulatory activity from a glioblastoma multiforme cell line being
biologically and biochemically indistinguishable from GCSF produced by a bladder cell line has also been found.
Colony-stimulating factors not only affect immune cells, but also induce cellular responses of non-immune cells, indicating
possible involvement in tumor development upon abnormal expression. In addition several other genes of the 17q12
ARCHEON are involved in proliferation, survival, differentiation of immune cells and/or lymphoblastic leukemia, such
as MLLT6, ZNF144 and ZNFN1A3, again demonstrating the related functions of the gene products in interconnected
key processes within specific cell types. Aberrant expression of more than one of these genes in non-immune cells
constitutes signalling activities that contribute to the oncogenic activities that derive solely from overexpression of the
Her-2/neu gene.

[0122] PPARBP has been found in complex with the tumor suppressor gene of the p53 family. Moreover, PPARBP
also binds to PPAR-alpha (PPARA), RAR-alpha (RARA), RXR, THRA and TR-beta-1. Due to its ability to bind to thyroid
hormone receptors it has been named TRIP2 and TRAP220. In these complexes PPARBP affects gene regulatory
activities. Interestingly, PPARBP is located in genomic proximity to its interaction partners THRA and RARA. We have
found PPARBP to be co-amplified with THRA and RARA in tumor tissue. THRA has been isolated from avian
erythroblastosis virus in conjunction with ERBB and therefore was named ERBA. ERBA potentiates ERBB by blocking
differentiation of erythroblasts at an immature stage. ERBA has been shown to influence ERBB expression. In this
setting deletions of C-terminal portions of the THRA gene product are of influence. Aberrant THRA expression has
also been found in nonfunctioning pituitary tumors, which has been hypothesized to reflect mutations in the receptor
coding and regulatory sequences. THRA function promotes tumor cell development by regulating gene expression of
regulatory genes and by influencing metabolic activities (e.g. of key enzymes of alternative metabolic pathways in
tumors such as malic enzyme and genes responsible for lipogenesis). The observed activities of nuclear receptors not
only reflect their transactivating potential, but are also due to posttranscriptional activities in the absence or presence
of ligands. Co-amplification of THRA/ERBA and ERBB has been shown, but its influence on tumor development has
been doubted as no overexpression could be demonstrated in breast tumors [van de Vijver et al., 1987, (75)]. THRA
and RARA are part of the nuclear receptor family, whose function can be mediated as monomers, homodimers or
heterodimers. RARA regulates differentiation of a broad spectrum of cells. Interactions of hormones with ERBB expression
have been investigated. Ligands of RARA can inhibit the expression of amplified ERBB genes in breast tumors
[Offterdinger et al., 1998, (76)]. As part of this invention, co-amplification and co-expression of THRA and RARA could
be shown. It was also found that multiple genes, which are regulated by members of the thyroid hormone receptor
and retinoic acid receptor family, are differentially expressed in tumor samples, corresponding to their genomic
alterations (amplification, mutation, deletion). These hormone receptor genes and respective target genes are useful to
discriminate patient samples with respect to clinical features.

[0123] By expression analysis of multiple normal tissues, tumor samples and tumor cell lines and subsequent
clustering of the 17q12 region, it was found that the expression profile of Her-2/neu positive tumor cells and tumor samples
exhibits similarities with the expression pattern of tissue from the central nervous system (Fig. 2). This is in line with
the observed malformations in the central nervous system of Her-2/neu and THRA knock-out mice. Moreover, it was
found that NEUROD2, a nuclear factor involved specifically in neurogenesis, is commonly expressed in the respective
samples. This led to the definition of the 17q12 locus as being an "ARCHEON", whose primary function in normal
organ development is defined to the central nervous system. Surprisingly, the expression of NEUROD2 was affected
by therapeutic intervention. Strikingly, also ZNF144, TEM7, PIP5K and PPP1R1B are expressed in neuronal cells,
where they display diverse tissue specific functions.

[0124] In addition Her-2/neu is often co-amplified with GRB7, a downstream member of the signaling cascade being
involved in invasive properties of tumors. Surprisingly, we have found another member of the Her-2/neu signaling
cascade being overexpressed in primary breast tumors: TOB1 (= "Transducer of ERBB signaling"). Strong
overexpression of TOB1 correlated with weaker overexpression of Her-2/neu, already indicating its involvement in oncogenic
signaling activities. Amplification of Her-2/neu has been assigned to enhanced proliferative capacity, due to the
identified downstream components of the signaling cascade (e.g. Ras-Raf-MAPK). In this respect it was surprising that
some cdc genes, which are cell cycle dependent kinases, are part of the amplicons, which upon altered expression
have great impact on cell cycle progression.

[0125] According to the observations described above the following examples of genes at 3q21-26 are offered by 
way of illustration, not by way of limitation. 

-> WNT5A, CACNA1D, THRB, RARB, TOP2B, RAB5B, SMARCC1 (BAF155), RAF, WNT7A 


[0126] The following examples of genes at 12q13 are offered by way of illustration, not by way of limitation. 

-> CACNB3, Keratins, NR4A1, RAB5/13, RARgamma, STAT6, WNT10B, (GCN5), (SAS: Sarcoma Amplified
Sequence), SMARCC2 (BAF170), SMARCD1 (BAF60A), (GAS41: Glioma Amplified Sequence), (CHOP), Her3,
KRTHB, HOXC, IGFBP6, WNT5B

[0127] There is cross-talk between the amplified ARCHEONs described above and some other highly amplified
genomic regions located approximately at 1p13, 1q32, 2p16, 2q21, 3p12, 5p13, 6p12, 7p12, 7q21, 8q23, 11q13, 13q12,
19q13, 20q13 and 21q11. The above mentioned chromosomal regions are described by way of illustration, not by way
of limitation, as the amplified regions often span larger and/or overlapping positions at these chromosomal positions.

[0128] Additional alterations of non-transcribed genes, pseudogenes or intergenic regions of said chromosomal
locations can be measured for prediction, diagnosis, prognosis, prevention and treatment of malignant neoplasia and
breast cancer in particular. Some of the genes or genomic regions have no direct influence on the members of the
ARCHEONs or the genes within distinct chromosomal regions but still retain marker gene function due to their
chromosomal positioning in the neighborhood of functionally critical genes (e.g. Telethonin neighboring the Her-2/neu gene).
[0129] The invention further relates to the use of: 

a) a polynucleotide comprising at least one of the sequences of SEQ ID NO: 1 to 26 or 53 to 75;

b) a polynucleotide which hybridizes under stringent conditions to a polynucleotide specified in (a) encoding a
polypeptide exhibiting the same biological function as specified for the respective sequence in Table 2 or 3;

c) a polynucleotide the sequence of which deviates from the polynucleotide specified in (a) and (b) due to the
degeneration of the genetic code encoding a polypeptide exhibiting the same biological function as specified for the
respective sequence in Table 2 or 3;

d) a polynucleotide which represents a specific fragment, derivative or allelic variation of a polynucleotide sequence
specified in (a) to (c);

e) an antisense molecule targeting specifically one of the polynucleotide sequences specified in (a) to (d);

f) a purified polypeptide encoded by a polynucleotide sequence specified in (a) to (d);

g) a purified polypeptide comprising at least one of the sequences of SEQ ID NO: 27 to 52 or 76 to 98;

h) an antibody capable of binding to one of the polynucleotides specified in (a) to (d) or a polypeptide specified in
(f) and (g);

j) a reagent identified by any of the methods of claim 14 to 16 that modulates the amount or activity of a
polynucleotide sequence specified in (a) to (d) or a polypeptide specified in (f) and (g)

in the preparation of a composition for the prevention, prediction, diagnosis, prognosis or a medicament for the
treatment of malignant neoplasia and breast cancer in particular.

Polynucleotides 

[0130] A "BREAST CANCER GENE" polynucleotide can be single- or double-stranded and comprises a coding
sequence or the complement of a coding sequence for a "BREAST CANCER GENE" polypeptide. Degenerate nucleotide
sequences encoding human "BREAST CANCER GENE" polypeptides, as well as homologous nucleotide sequences
which are at least about 50, 55, 60, 65, 70, preferably about 75, 90, 96, or 98% identical to the nucleotide sequences
of SEQ ID NO: 1 to 26 or 53 to 75 also are "BREAST CANCER GENE" polynucleotides. Percent sequence identity
between the sequences of two polynucleotides is determined using computer programs such as ALIGN which employ
the FASTA algorithm, using an affine gap search with a gap open penalty of -12 and a gap extension penalty of -2.
Complementary DNA (cDNA) molecules, species homologues, and variants of "BREAST CANCER GENE"
polynucleotides which encode biologically active "BREAST CANCER GENE" polypeptides also are "BREAST CANCER GENE"
polynucleotides.
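
Percent identity as referred to above is evaluated on an alignment of the two sequences. The following Python sketch does not reproduce ALIGN or the FASTA algorithm; it only illustrates, for a pair of sequences that has already been aligned (gaps written as "-"), how percent identity and an affine gap score might be computed. Only the gap penalties of -12 and -2 come from the text. The match and mismatch scores, the choice of denominator (aligned, non-gap columns) and the convention that the first position of a gap is charged the open penalty and each further position the extension penalty are assumptions, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned, non-gap columns that carry identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def affine_gap_score(aligned_a, aligned_b, match=5, mismatch=-4,
                     gap_open=-12, gap_extend=-2):
    """Score an existing alignment with affine gap penalties.

    match/mismatch are assumed values; gap_open and gap_extend are the
    penalties quoted in the text.
    """
    score = 0
    gap_side = None  # which sequence currently carries an open gap
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            side = "a" if a == "-" else "b"
            score += gap_extend if gap_side == side else gap_open
            gap_side = side
        else:
            score += match if a == b else mismatch
            gap_side = None
    return score


print(percent_identity("ACGTACGT", "ACGTACGA"))   # 87.5
print(affine_gap_score("ACG--TAC", "ACGTTTAC"))   # 16
```

A sequence would then fall under one of the identity tiers listed above (for example 75% or 90%) by comparing the returned percentage with the chosen threshold.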

Preparation of Polynucleotides 


[0131] A naturally occurring "BREAST CANCER GENE" polynucleotide can be isolated free of other cellular
components such as membrane components, proteins, and lipids. Polynucleotides can be made by a cell and isolated
using standard nucleic acid purification techniques, or synthesized using an amplification technique, such as the
polymerase chain reaction (PCR), or by using an automatic synthesizer. Methods for isolating polynucleotides are
routine and are known in the art. Any such technique for obtaining a polynucleotide can be used to obtain isolated
"BREAST CANCER GENE" polynucleotides. For example, restriction enzymes and probes can be used to isolate
polynucleotide fragments which comprise "BREAST CANCER GENE" nucleotide sequences. Isolated
polynucleotides are in preparations which are free or at least 70, 80, or 90% free of other molecules.

[0132] "BREAST CANCER GENE" cDNA molecules can be made with standard molecular biology techniques, using
"BREAST CANCER GENE" mRNA as a template. Any RNA isolation technique which does not select against the
isolation of mRNA may be utilized for the purification of such RNA samples. See, for example, Sambrook et al., 1989,
(77); and Ausubel, F. M. et al., 1989, (78), both of which are incorporated herein by reference in their entirety.
Additionally, large numbers of tissue samples may readily be processed using techniques well known to those of skill in
the art, such as, for example, the single-step RNA isolation process of Chomczynski, P. (1989, U.S. Pat. No. 4,843,155),
which is incorporated herein by reference in its entirety.

[0133] "BREAST CANCER GENE" cDNA molecules can thereafter be replicated using molecular biology techniques
known in the art and disclosed in manuals such as Sambrook et al., 1989, (77). An amplification technique, such as
PCR, can be used to obtain additional copies of polynucleotides of the invention, using either human genomic DNA or
cDNA as a template.

[0134] Alternatively, synthetic chemistry techniques can be used to synthesize "BREAST CANCER GENE"
polynucleotides. The degeneracy of the genetic code allows alternate nucleotide sequences to be synthesized which will
encode a "BREAST CANCER GENE" polypeptide or a biologically active variant thereof.
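
The degeneracy mentioned here can be made concrete with a small Python sketch that enumerates every nucleotide sequence encoding a given short peptide under the standard genetic code. The function names and the example peptide are hypothetical and chosen only for illustration; the codon table is the standard code written in the usual T, C, A, G ordering.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Reverse lookup: amino acid -> list of synonymous codons.
SYNONYMS = {}
for codon, amino_acid in CODON_TABLE.items():
    SYNONYMS.setdefault(amino_acid, []).append(codon)


def translate(dna):
    """Translate a DNA coding sequence codon by codon (trailing bases ignored)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_sequences(peptide):
    """Yield every DNA sequence that encodes the given peptide."""
    for codons in product(*(SYNONYMS[aa] for aa in peptide)):
        yield "".join(codons)


variants = list(synonymous_sequences("MLK"))  # hypothetical three-residue peptide
print(len(variants))                          # 12 (1 x 6 x 2 codon choices)
print(all(translate(v) == "MLK" for v in variants))
```

The number of alternatives grows multiplicatively with peptide length, so full enumeration is only practical for short stretches such as primer or probe design regions.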


Identification of differential expression 

[0135] Transcripts within the collected RNA samples which represent RNA produced by differentially expressed
genes may be identified by utilizing a variety of methods which are well known to those of skill in the art. For example,
differential screening [Tedder, T. F. et al., 1988, (79)], subtractive hybridization [Hedrick, S. M. et al., 1984, (80); Lee,
S. W. et al., 1984, (81)], and, preferably, differential display (Liang, P., and Pardee, A. B., 1993, U.S. Pat. No. 5,262,311,
which is incorporated herein by reference in its entirety), may be utilized to identify polynucleotide sequences derived
from genes that are differentially expressed.

[0136] Differential screening involves the duplicate screening of a cDNA library in which one copy of the library is
screened with a total cell cDNA probe corresponding to the mRNA population of one cell type while a duplicate copy
of the cDNA library is screened with a total cDNA probe corresponding to the mRNA population of a second cell type.
For example, one cDNA probe may correspond to a total cell cDNA probe of a cell type derived from a control subject,
while the second cDNA probe may correspond to a total cell cDNA probe of the same cell type derived from an
experimental subject. Those clones which hybridize to one probe but not to the other potentially represent clones derived
from genes differentially expressed in the cell type of interest in control versus experimental subjects.
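
The selection rule of differential screening (clones that hybridize to one probe but not to the other) amounts to a set difference in both directions. The short Python sketch below illustrates this under the assumption that hybridization results are available as lists of clone identifiers; the function name and the identifiers are hypothetical.

```python
def differential_clones(hits_control, hits_experimental):
    """Return clones that hybridized to only one of the two probes."""
    control = set(hits_control)
    experimental = set(hits_experimental)
    return {
        "control_only": sorted(control - experimental),
        "experimental_only": sorted(experimental - control),
    }


# Made-up clone identifiers for illustration.
result = differential_clones(["c1", "c2", "c3"], ["c2", "c3", "c4"])
print(result)  # {'control_only': ['c1'], 'experimental_only': ['c4']}
```

Clones found in both lists are not informative in this scheme, and any candidate from either list would still need the corroboration steps described further below.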

[0137] Subtractive hybridization techniques generally involve the isolation of mRNA taken from two different sources,
e.g., control and experimental tissue, the hybridization of the mRNA or single-stranded cDNA reverse-transcribed from
the isolated mRNA, and the removal of all hybridized, and therefore double-stranded, sequences. The remaining
non-hybridized, single-stranded cDNAs potentially represent clones derived from genes that are differentially expressed
in the two mRNA sources. Such single-stranded cDNAs are then used as the starting material for the construction of
a library comprising clones derived from differentially expressed genes.

[0138] The differential display technique describes a procedure, utilizing the well known polymerase chain reaction
(PCR; the experimental embodiment set forth in Mullis, K. B., 1987, U.S. Pat. No. 4,683,202) which allows for the
identification of sequences derived from genes which are differentially expressed. First, isolated RNA is
reverse-transcribed into single-stranded cDNA, utilizing standard techniques which are well known to those of skill in the art.
Primers for the reverse transcriptase reaction may include, but are not limited to, oligo dT-containing primers, preferably
of the reverse primer type of oligonucleotide described below. Next, this technique uses pairs of PCR primers, as described
below, which allow for the amplification of clones representing a random subset of the RNA transcripts present within
any given cell. Utilizing different pairs of primers allows each of the mRNA transcripts present in a cell to be amplified.
Among such amplified transcripts may be identified those which have been produced from differentially expressed
genes.

[0139] The reverse oligonucleotide primer of the primer pairs may contain an oligo dT stretch of nucleotides,
preferably eleven nucleotides long, at its 5' end, which hybridizes to the poly(A) tail of mRNA or to the complement of a
cDNA reverse transcribed from an mRNA poly(A) tail. Second, in order to increase the specificity of the reverse primer,
the primer may contain one or more, preferably two, additional nucleotides at its 3' end. Because, statistically, only a
subset of the mRNA derived sequences present in the sample of interest will hybridize to such primers, the additional
nucleotides allow the primers to amplify only a subset of the mRNA derived sequences present in the sample of interest.
This is preferred in that it allows more accurate and complete visualization and characterization of each of the bands
representing amplified sequences.
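
The design of such anchored reverse primers can be sketched as follows in Python: an oligo dT stretch of eleven nucleotides followed by two additional 3' nucleotides. The function name is hypothetical. The exclusion of T at the first additional position is an assumption made here (a T directly after the oligo dT stretch would merely lengthen it rather than anchor the primer) and is not stated in the text.

```python
from itertools import product


def anchored_reverse_primers(dt_length=11, anchor_length=2):
    """List oligo-dT primers (5'->3') carrying additional 3' anchor nucleotides."""
    primers = []
    for anchor in product("ACGT", repeat=anchor_length):
        # Assumption: skip anchors starting with T, which would not anchor the
        # primer at the junction between the poly(A) tail and the transcript.
        if anchor[0] == "T":
            continue
        primers.append("T" * dt_length + "".join(anchor))
    return primers


primers = anchored_reverse_primers()
print(len(primers))   # 12
print(primers[0])     # TTTTTTTTTTTAA
```

Under the simplifying assumption that the bases preceding the poly(A) tail are uniformly distributed, each of the twelve primers would address roughly one twelfth of the polyadenylated transcripts, which is the subset effect described above.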

[0140] The forward primer may contain a nucleotide sequence expected, statistically, to have the ability to hybridize
to cDNA sequences derived from the tissues of interest. The nucleotide sequence may be an arbitrary one, and the
length of the forward oligonucleotide primer may range from about 9 to about 13 nucleotides, with about 10 nucleotides
being preferred. Arbitrary primer sequences cause the lengths of the amplified partial cDNAs produced to be variable,
thus allowing different clones to be separated by using standard denaturing sequencing gel electrophoresis. PCR
reaction conditions should be chosen which optimize amplified product yield and specificity, and, additionally, produce
amplified products of lengths which may be resolved utilizing standard gel electrophoresis techniques. Such reaction
conditions are well known to those of skill in the art, and important reaction parameters include, for example, length
and nucleotide sequence of oligonucleotide primers as discussed above, and annealing and elongation step
temperatures and reaction times. The pattern of clones resulting from the reverse transcription and amplification of the mRNA
of two different cell types is displayed via sequencing gel electrophoresis and compared. Differences in the two banding
patterns indicate potentially differentially expressed genes.
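
The comparison of two banding patterns can be illustrated with a minimal Python sketch. It assumes that each lane has already been reduced to a list of band sizes in nucleotides and that two bands are regarded as the same if their sizes differ by no more than a tolerance; the function name, the tolerance and the band sizes are hypothetical, and band intensity is ignored.

```python
def differential_bands(lane_a, lane_b, tolerance=1):
    """Return (bands only in lane_a, bands only in lane_b).

    Bands are given as sizes in nucleotides; two bands match when their
    sizes differ by at most `tolerance`.
    """
    def unmatched(bands, others):
        return [b for b in bands if all(abs(b - o) > tolerance for o in others)]

    return unmatched(lane_a, lane_b), unmatched(lane_b, lane_a)


# Made-up band sizes for a control and an experimental lane.
only_control, only_experimental = differential_bands([150, 220, 310], [151, 310, 400])
print(only_control)       # [220]
print(only_experimental)  # [400]
```

Bands flagged in this way are only candidates; as the text notes, they indicate potentially differentially expressed genes.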

[0141] When screening for full-length cDNAs, it is preferable to use libraries that have been size-selected to include
larger cDNAs. Randomly-primed libraries are preferable, in that they will contain more sequences which contain the
5' regions of genes. Use of a randomly primed library may be especially preferable for situations in which an oligo
d(T) library does not yield a full-length cDNA. Genomic libraries can be useful for extension of sequence into 5'
non-transcribed regulatory regions.

[0142] Commercially available capillary electrophoresis systems can be used to analyze the size or confirm the
nucleotide sequence of PCR or sequencing products. For example, capillary sequencing can employ flowable polymers
for electrophoretic separation, four different fluorescent dyes (one for each nucleotide) which are laser activated, and
detection of the emitted wavelengths by a charge coupled device camera. Output/light intensity can be converted to
electrical signal using appropriate software (e.g. GENOTYPER and Sequence NAVIGATOR, Perkin Elmer; ABI), and
the entire process from loading of samples to computer analysis and electronic data display can be computer controlled.
Capillary electrophoresis is especially preferable for the sequencing of small pieces of DNA which might be present in
limited amounts in a particular sample.

[0143] Once potentially differentially expressed gene sequences have been identified via bulk techniques such as,
for example, those described above, the differential expression of such putatively differentially expressed genes should
be corroborated. Corroboration may be accomplished via, for example, such well known techniques as Northern
analysis and/or RT-PCR. Upon corroboration, the differentially expressed genes may be further characterized, and may
be identified as target and/or marker genes, as discussed below.

[0144] Also, amplified sequences of differentially expressed genes obtained through, for example, differential display
may be used to isolate full length clones of the corresponding gene. The full length coding portion of the gene may
readily be isolated, without undue experimentation, by molecular biological techniques well known in the art. For
example, the isolated differentially expressed amplified fragment may be labeled and used to screen a cDNA library.
Alternatively, the labeled fragment may be used to screen a genomic library.

[0145] An analysis of the tissue distribution of the mRNA produced by the identified genes may be conducted, utilizing
standard techniques well known to those of skill in the art. Such techniques may include, for example, Northern analyses
and RT-PCR. Such analyses provide information as to whether the identified genes are expressed in tissues expected
to contribute to breast cancer. Such analyses may also provide quantitative information regarding steady state mRNA
regulation, yielding data concerning which of the identified genes exhibits a high level of regulation in, preferably,
tissues which may be expected to contribute to breast cancer.

[0146] Such analyses may also be performed on an isolated cell population of a particular cell type derived from a
given tissue. Additionally, standard in situ hybridization techniques may be utilized to provide information regarding
which cells within a given tissue express the identified gene. Such analyses may provide information regarding the
biological function of an identified gene relative to breast cancer in instances wherein only a subset of the cells within
the tissue is thought to be relevant to breast cancer.

Identification of co-amplified genes 


[0147] Genes involved in genomic alterations (amplifications, insertions, translocations, deletions, etc.) are identified
by PCR-based karyotyping in combination with database analysis. Of particular interest are gene amplifications, which
account for gene copy numbers >2 per cell. Gene copy number and gene expression of the respective genes often
correlate. Therefore clusters of genes being simultaneously overexpressed due to gene amplifications can be
identified by expression analysis via DNA-chip technologies or quantitative RT-PCR. For example, the altered expression
of genes due to increased or decreased gene copy numbers can be determined by GeneArray™ technologies from
Affymetrix or qRT-PCR with the TaqMan or iCycler Systems. Moreover, the combination of RNA and DNA analytics enables
highly parallel and automated characterization of multiple genomic regions of variable length with high resolution in
tissue or single cell samples. Furthermore these assays enable the correlation of gene transcription relative to gene
copy number of target genes. As there is not necessarily a linear correlation of expression level and gene copy number
and as there are synergistic or antagonistic effects in certain gene clusters, the identification on the RNA level is easier
and probably more relevant for the biological outcome of the alterations, especially in tumor tissue.
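
The correlation of gene transcription with gene copy number mentioned above can be illustrated with a minimal Python sketch that computes a Pearson correlation coefficient across samples. The function name and the sample values are hypothetical, and the Pearson coefficient is used here only as one possible measure; as the paragraph notes, the relationship is not necessarily linear, so a rank-based measure might be preferred in practice.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up per-sample values for one gene: copies per cell and relative expression.
copy_numbers = [2, 2, 4, 6, 8]
expression = [1.0, 1.2, 2.1, 3.3, 3.9]
print(round(pearson(copy_numbers, expression), 3))
```

Applied gene by gene across a chromosomal region, such a coefficient would indicate for which genes the expression level follows the copy number and for which it does not.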

Detection of co-amplified genes in malignant neoplasia 


[0148] Chromosomal changes are commonly detected by FISH (= Fluorescence In Situ Hybridization) and CGH
(= Comparative Genomic Hybridization). For quantification of genomic regions, genes or intergenic regions can be used.
Such quantification measures the relative abundance of multiple genes with respect to each other (e.g. target gene
vs. centromeric region or housekeeping genes). Changes in relative abundance can be detected in paraffin-embedded
material even after extraction of RNA or genomic DNA. Measurement of genomic DNA has advantages compared to
RNA analysis due to the stability of DNA, which makes retrospective studies possible and
offers multiple internal controls (genes that are not altered, amplified or deleted) for standardization and exact
calculations. Moreover, PCR analysis of genomic DNA offers the advantage of investigating intergenic, highly variable regions
or combinations of SNPs (= Single Nucleotide Polymorphisms), RFLPs, VNTRs and STRs (in general, polymorphic
markers). Determination of SNPs or polymorphic markers within defined genomic regions (e.g. SNP analysis by
"Pyrosequencing™") has an impact on the phenotype of the genomic alterations. For example, it is of advantage to determine
combinations of polymorphisms or haplotypes in order to characterize the biological potential of genes being part of
amplified alleles. Of particular interest are polymorphic markers in breakpoint regions, coding regions or regulatory
regions of genes or intergenic regions. By determining predictive haplotypes with defined biological or clinical outcome,
it is possible to establish diagnostic and prognostic assays with non-tumor samples from patients. Depending on
whether preferably one allele or both alleles to the same extent are amplified (= linear or non-linear amplifications), haplotypes
can be determined. Overrepresentation of specific polymorphic marker combinations in cells or tissues with gene
amplifications facilitates haplotype determination, as e.g. combinations of heterozygous polymorphic markers in
nucleic acids isolated from normal tissues, body fluids or biological samples of one patient become almost homozygous
in neoplastic tissue of the very same patient. This "gain of homozygosity" corresponds to the measurement of an altered
genomic region due to amplification events and is suitable for identification of "gain of function" alterations in tumors,
which result in e.g. oncogenic or growth-promoting activities. In contrast, the detection of "losses of heterozygosity"
is used for identification of anti-oncogenes, gatekeeper genes or checkpoint genes that suppress oncogenic activities
and negatively regulate cellular growth processes. This intrinsic difference clearly contrasts the impact of the respective
genomic regions on tumor development and emphasizes the significance of the "gain of homozygosity" measurements
disclosed in this invention. In addition to the analyses of SNPs, a comparative approach of blood leucocyte DNA and
tumor DNA based on VNTR detection can reveal the existence of a previously described ARCHEON. SNP and VNTR
sequences and primer sets most suitable for detection of the ARCHEON at 17q11-21 are disclosed in Table 4 and Table
6. Detection, quantification and sizing of such polymorphic markers can be achieved by methods known to those with
skill in the art. In one embodiment of this invention we disclose the comparative measurement of amount and size of
any of the disclosed VNTRs (Table 6) by PCR amplification and capillary electrophoresis (CE). PCR can be carried out by
standard protocols, favorably in a linear amplification range (low cycle number), and detection by CE should be carried
out according to the supplier's protocols (e.g. Agilent). More favorably, the detection of the VNTRs disclosed in Table 6 can be carried
out in a multiplex fashion, utilizing a variety of labeled primers (e.g. fluorescent, radioactive, bioactive) and a suitable
CE detection system (e.g. ABI 310). However, the detection can also be performed on slab gels consisting of highly
concentrated agarose or polyacrylamide with a monochromal DNA stain. Enhancement of resolution can be achieved
by appropriate primer design and length variation to give best results in multiplex PCR.
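
The "gain of homozygosity" comparison described above can be illustrated with a short Python sketch. It is not the assay of the invention: the function names, the use of two allele signals (e.g. CE peak areas) as input, and the 0.8 cut-off are assumptions made only for this example.

```python
def major_allele_fraction(peak_a: float, peak_b: float) -> float:
    """Fraction of the total signal contributed by the stronger allele (0.5 to 1.0)."""
    total = peak_a + peak_b
    if total <= 0:
        raise ValueError("allele signals must sum to a positive value")
    return max(peak_a, peak_b) / total


def gain_of_homozygosity(normal, tumor, threshold=0.8):
    """Return True if a marker that is heterozygous in normal tissue is
    strongly skewed toward one allele in tumor tissue of the same patient.

    normal, tumor: (allele A signal, allele B signal) for one polymorphic marker.
    threshold: hypothetical cut-off for calling a sample "almost homozygous".
    """
    normal_fraction = major_allele_fraction(*normal)
    tumor_fraction = major_allele_fraction(*tumor)
    heterozygous_in_normal = normal_fraction < threshold
    return heterozygous_in_normal and tumor_fraction >= threshold


# Hypothetical signals for one VNTR marker (blood leucocyte DNA vs. tumor DNA)
print(gain_of_homozygosity(normal=(1000, 950), tumor=(4800, 900)))
```

In practice the cut-off would have to be calibrated against unaltered control markers from the same sample.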

[0149] It is also of interest to determine covalent modifications of DNA (e.g. methylation) or of the associated chromatin
(e.g. acetylation or methylation of associated proteins) within the altered genomic regions that have an impact on the
transcriptional activity of the genes. In general, by measuring multiple short sequences (60-300 bp), these techniques
enable high-resolution analysis of target regions, which cannot be obtained by conventional methods such as FISH
analysis (2-100 kb). Moreover, the PCR-based DNA analysis techniques offer advantages with regard to sensitivity,
specificity, multiplexing, time consumption and the low amount of patient material required. These techniques can be
optimized by combination with microdissection or macrodissection to obtain purer starting material for analysis.

Extending Polynucleotides 

[0150] In one embodiment of such a procedure for the identification and cloning of full-length gene sequences, RNA
may be isolated, following standard procedures, from an appropriate tissue or cellular source. A reverse transcription
reaction may then be performed on the RNA using an oligonucleotide primer complementary to the mRNA that
corresponds to the amplified fragment, for the priming of first strand synthesis. Because the primer is anti-parallel to the
mRNA, extension will proceed toward the 5' end of the mRNA. The resulting RNA hybrid may then be "tailed" with
guanines using a standard terminal transferase reaction, the hybrid may be digested with RNase H, and second strand
synthesis may then be primed with a poly-C primer. Using the two primers, the 5' portion of the gene is amplified using
PCR. Sequences obtained may then be isolated and recombined with previously isolated sequences to generate a
full-length cDNA of the differentially expressed genes of the invention. For a review of cloning strategies and
recombinant DNA techniques, see e.g., Sambrook et al., (77); and Ausubel et al., (78).

[0151] Various PCR-based methods can be used to extend the polynucleotide sequences disclosed herein to detect
upstream sequences such as promoters and regulatory elements. For example, restriction site PCR uses universal
primers to retrieve unknown sequence adjacent to a known locus [Sarkar, 1993, (82)]. Genomic DNA is first amplified
in the presence of a primer to a linker sequence and a primer specific to the known region. The amplified sequences
are then subjected to a second round of PCR with the same linker primer and another specific primer internal to the
first one. Products of each round of PCR are transcribed with an appropriate RNA polymerase and sequenced using
reverse transcriptase.

[0152] Inverse PCR also can be used to amplify or extend sequences using divergent primers based on a known
region [Triglia et al., 1988, (83)]. Primers can be designed using commercially available software, such as OLIGO 4.06
Primer Analysis software (National Biosciences Inc., Plymouth, Minn.), to be e.g. 22-30 nucleotides in length, to have
a GC content of 50% or more, and to anneal to the target sequence at temperatures of about 68-72°C. The method uses
several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then
circularized by intramolecular ligation and used as a PCR template.
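
The primer criteria named above can be screened with a small helper. The following Python sketch is an illustration only and does not reproduce the OLIGO software: the function names are hypothetical, the length window is read here as 22 to 30 nucleotides, and the annealing temperature criterion (about 68-72°C) is deliberately not computed because it depends on the chosen thermodynamic model.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def meets_primer_criteria(seq: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 50.0) -> bool:
    """Check the length window and the minimum GC content only."""
    return min_len <= len(seq) <= max_len and gc_content(seq) >= min_gc


# Hypothetical candidate primers (not sequences from the disclosure)
for primer in ("GCGTACCTGAGGCTAGCCATGGAC", "ATATATTTAAATATATTTAAATAT"):
    print(primer, meets_primer_criteria(primer))
```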
[0153] Another method which can be used is capture PCR, which involves PCR amplification of DNA fragments
adjacent to a known sequence in human and yeast artificial chromosome DNA [Lagerstrom et al., 1991, (84)]. In this 
method, multiple restriction enzyme digestions and ligations also can be used to place an engineered double-stranded 
sequence into an unknown fragment of the DNA molecule before performing PCR. 

[0154] Additionally, PCR, nested primers, and PROMOTERFINDER libraries (CLONTECH, Palo Alto, Calif.) can be
used to walk genomic DNA. This process avoids the need to screen libraries and is
useful in finding intron/exon junctions.

[0155] The sequences of the identified genes may be used, utilizing standard techniques, to place the genes onto
genetic maps, e.g., mouse [Copeland & Jenkins, 1991, (85)] and human genetic maps [Cohen et al., 1993, (86)]. Such
mapping information may yield information regarding the genes' importance to human disease by, for example,
identifying genes which map near genetic regions to which known genetic breast cancer tendencies map.

Identification of polynucleotide variants and homologues or splice variants

[0156] Variants and homologues of the "BREAST CANCER GENE" polynucleotides described above also are
"BREAST CANCER GENE" polynucleotides. Typically, homologous "BREAST CANCER GENE" polynucleotide
sequences can be identified by hybridization of candidate polynucleotides to known "BREAST CANCER GENE"
polynucleotides under stringent conditions, as is known in the art. For example, using the following wash conditions: 2 X SSC
(0.3 M NaCl, 0.03 M sodium citrate, pH 7.0), 0.1% SDS, room temperature twice, 30 minutes each; then 2 X SSC,
0.1% SDS, 50°C once, 30 minutes; then 2 X SSC, room temperature twice, 10 minutes each, homologous sequences
can be identified which contain at most about 25-30% basepair mismatches. More preferably, homologous
polynucleotide strands contain 15-25% basepair mismatches, even more preferably 5-15% basepair mismatches.
[0157] Species homologues of the "BREAST CANCER GENE" polynucleotides disclosed herein also can be
identified by making suitable probes or primers and screening cDNA expression libraries from other species, such as mice,
monkeys, or yeast. Human variants of "BREAST CANCER GENE" polynucleotides can be identified, for example, by
screening human cDNA expression libraries. It is well known that the Tm of a double-stranded DNA decreases by
1-1.5°C with every 1% decrease in homology [Bonner et al., 1973, (87)]. Variants of human "BREAST CANCER GENE"
polynucleotides or "BREAST CANCER GENE" polynucleotides of other species can therefore be identified by
hybridizing a putative homologous "BREAST CANCER GENE" polynucleotide with a polynucleotide having a nucleotide
sequence of one of the sequences of the SEQ ID NO: 1 to 26 or 53 to 75 or the complement thereof to form a test
hybrid. The melting temperature of the test hybrid is compared with the melting temperature of a hybrid comprising
polynucleotides having perfectly complementary nucleotide sequences, and the number or percent of basepair
mismatches within the test hybrid is calculated.
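
The comparison of melting temperatures described above amounts to simple arithmetic. A minimal Python sketch follows, assuming only the stated rule that Tm drops by 1-1.5°C per 1% decrease in homology; the function name and the example temperatures are hypothetical.

```python
def percent_mismatch_range(tm_perfect_c: float, tm_test_c: float):
    """Estimate the range of basepair mismatches (in percent) in a test hybrid.

    tm_perfect_c: melting temperature of the perfectly complementary hybrid (deg C)
    tm_test_c:    melting temperature of the test hybrid (deg C)
    Returns (low, high), using 1.5 deg C and 1.0 deg C per 1% mismatch respectively.
    """
    delta = tm_perfect_c - tm_test_c
    if delta < 0:
        raise ValueError("test hybrid should not melt above the perfect hybrid")
    return (delta / 1.5, delta / 1.0)


# Hypothetical measurement: the test hybrid melts 15 deg C below the perfect hybrid
low, high = percent_mismatch_range(85.0, 70.0)
print(f"estimated mismatches: {low:.1f}% to {high:.1f}%")
```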

[0158] Nucleotide sequences which hybridize to "BREAST CANCER GENE" polynucleotides or their complements
following stringent hybridization and/or wash conditions also are "BREAST CANCER GENE" polynucleotides. Stringent
wash conditions are well known and understood in the art and are disclosed, for example, in Sambrook et al., (77).
Typically, for stringent hybridization conditions a combination of temperature and salt concentration should be chosen
that is approximately 12-20°C below the calculated Tm of the hybrid under study. The Tm of a hybrid between a "BREAST
CANCER GENE" polynucleotide having a nucleotide sequence of one of the sequences of the SEQ ID NO: 1 to 26 or
53 to 75 or the complement thereof and a polynucleotide sequence which is at least about 50, preferably about 75,
90, 96, or 98% identical to one of those nucleotide sequences can be calculated, for example, using the equation below
[Bolton and McCarthy, 1962, (88)]:

Tm = 81.5°C - 16.6(log10[Na+]) + 0.41(%G + C) - 0.63(%formamide) - 600/l,

where l = the length of the hybrid in basepairs.
[0159] Stringent wash conditions include, for example, 4 X SSC at 65°C, or 50% formamide, 4 X SSC at 28°C, or 
0.5 X SSC, 0.1% SDS at 65°C. Highly stringent wash conditions include, for example, 0.2 X SSC at 65°C. 
[0160] The biological function of the identified genes may be more directly assessed by utilizing relevant in vivo and
in vitro systems. In vivo systems may include, but are not limited to, animal systems which naturally exhibit breast
cancer predisposition, or ones which have been engineered to exhibit such symptoms, including but not limited to the
apoE-deficient malignant neoplasia mouse model [Plump et al., 1992, (89)].

[0161] Splice variants derived from the same genomic region and encoded by the same pre-mRNA can be identified under the
hybridization conditions described above for the homology search. The specific characteristics of variant proteins encoded
by splice variants of the same pre-transcript may differ and can also be assayed as disclosed. A "BREAST CANCER
GENE" polynucleotide having a nucleotide sequence of one of the sequences of the SEQ ID NO: 1 to 26 or 53 to 75
or the complement thereof may therefore differ in parts of the entire sequence, as presented for SEQ ID NO: 60 and the
encoded splice variants SEQ ID NO: 61 to 66. These refer to the individual proteins SEQ ID NO: 83 to 89. The prediction
of splicing events and the identification of the utilized acceptor and donor sites within the pre-mRNA can be computed
(e.g. with the software packages GRAIL or GenomeSCAN) and verified by PCR methods by those with skill in the art.

Antisense oligonucleotides 

[0162] Antisense oligonucleotides are nucleotide sequences which are complementary to a specific DNA or RNA
sequence. Once introduced into a cell, the complementary nucleotides combine with natural sequences produced by
the cell to form complexes and block either transcription or translation. Preferably, an antisense oligonucleotide is at
least 6 nucleotides in length, but can be at least 7, 8, 10, 12, 15, 20, 25, 30, 35, 40, 45, or 50 or more nucleotides long.
Longer sequences also can be used. Antisense oligonucleotide molecules can be provided in a DNA construct and
introduced into a cell as described above to decrease the level of "BREAST CANCER GENE" gene products in the cell.
[0163] Antisense oligonucleotides can be deoxyribonucleotides, ribonucleotides, peptide nucleic acids (PNAs;
described in U.S. Pat. No. 5,714,331), locked nucleic acids (LNAs; described in WO 99/12826), or a combination of them.
Oligonucleotides can be synthesized manually or by an automated synthesizer, by covalently linking the 5' end of one
nucleotide with the 3' end of another nucleotide with non-phosphodiester internucleotide linkages such as
alkylphosphonates, phosphorothioates, phosphorodithioates, alkylphosphonothioates, phosphoramidates,
phosphate esters, carbamates, acetamidate, carboxymethyl esters, carbonates, and phosphate triesters [Brown, 1994,
(126); Sonveaux, 1994, (127) and Uhlmann et al., 1990, (128)].

[0164] Modifications of "BREAST CANCER GENE" expression can be obtained by designing antisense
oligonucleotides which will form duplexes to the control, 5', or regulatory regions of the "BREAST CANCER GENE".
Oligonucleotides derived from the transcription initiation site, e.g., between positions -10 and +10 from the start site, are preferred.
Similarly, inhibition can be achieved using "triple helix" base-pairing methodology. Triple helix pairing is useful because
it causes inhibition of the ability of the double helix to open sufficiently for the binding of polymerases, transcription
factors, or chaperones. Therapeutic advances using triplex DNA have been described in the literature [Gee et al., 1994,
(129)]. An antisense oligonucleotide also can be designed to block translation of mRNA by preventing the transcript
from binding to ribosomes.

[0165] Precise complementarity is not required for successful complex formation between an antisense
oligonucleotide and the complementary sequence of a "BREAST CANCER GENE" polynucleotide. Antisense oligonucleotides
which comprise, for example, 2, 3, 4, or 5 or more stretches of contiguous nucleotides which are precisely
complementary to a "BREAST CANCER GENE" polynucleotide, each separated by a stretch of contiguous nucleotides which
are not complementary to adjacent "BREAST CANCER GENE" nucleotides, can provide sufficient targeting specificity
for "BREAST CANCER GENE" mRNA. Preferably, each stretch of complementary contiguous nucleotides is at least
4, 5, 6, 7, or 8 or more nucleotides in length. Non-complementary intervening sequences are preferably 1, 2, 3, or 4
nucleotides in length. One skilled in the art can easily use the calculated melting point of an antisense-sense pair
to determine the degree of mismatching which will be tolerated between a particular antisense oligonucleotide and a
particular "BREAST CANCER GENE" polynucleotide sequence.

[0166] Antisense oligonucleotides can be modified without affecting their ability to hybridize to a "BREAST CANCER
GENE" polynucleotide. These modifications can be internal or at one or both ends of the antisense molecule. For
example, inter-nucleoside phosphate linkages can be modified by adding cholesteryl or diamine moieties with varying
numbers of carbon residues between the amino groups and terminal ribose. Modified bases and/or sugars, such as
arabinose instead of ribose, or a 3', 5' substituted oligonucleotide in which the 3' hydroxyl group or the 5' phosphate
group are substituted, also can be employed in a modified antisense oligonucleotide. These modified oligonucleotides
can be prepared by methods well known in the art [Agrawal et al., 1992, (130); Uhlmann et al., 1987, (131) and
Uhlmann et al., (128)].

Ribozymes 

[0167] Ribozymes are RNA molecules with catalytic activity [Cech, 1987, (132); Cech, 1990, (133) and Couture &
Stinchcomb, 1996, (134)]. Ribozymes can be used to inhibit gene function by cleaving an RNA sequence, as is known
in the art (e.g., Haseloff et al., U.S. Patent 5,641,673). The mechanism of ribozyme action involves sequence-specific
hybridization of the ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. Examples
include engineered hammerhead motif ribozyme molecules that can specifically and efficiently catalyze endonucleolytic
cleavage of specific nucleotide sequences.

[0168] The transcribed sequence of a "BREAST CANCER GENE" can be used to generate ribozymes which will
specifically bind to mRNA transcribed from a "BREAST CANCER GENE" genomic locus. Methods of designing and
constructing ribozymes which can cleave other RNA molecules in trans in a highly sequence specific manner have
been developed and described in the art [Haseloff et al., 1988, (135)]. For example, the cleavage activity of ribozymes
can be targeted to specific RNAs by engineering a discrete "hybridization" region into the ribozyme. The hybridization
region contains a sequence complementary to the target RNA and thus specifically hybridizes with the target [see, for
example, Gerlach et al., EP 0 321201].

[0169] Specific ribozyme cleavage sites within a "BREAST CANCER GENE" RNA target can be identified by
scanning the target molecule for ribozyme cleavage sites which include the following sequences: GUA, GUU, and GUC.
Once identified, short RNA sequences of between 15 and 20 ribonucleotides corresponding to the region of the target
RNA containing the cleavage site can be evaluated for secondary structural features which may render the target
inoperable. Suitability of candidate "BREAST CANCER GENE" RNA targets also can be evaluated by testing
accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays. Longer
complementary sequences can be used to increase the affinity of the hybridization sequence for the target. The hybridizing
and cleavage regions of the ribozyme can be integrally related such that upon hybridizing to the target RNA through
the complementary regions, the catalytic region of the ribozyme can cleave the target.
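
The scanning step described above can be sketched in Python. This is an illustrative helper, not a validated design tool: the function name, the default window of 17 nucleotides (chosen from within the stated 15 to 20 range) and the example sequence are assumptions, and the secondary structure evaluation is not covered.

```python
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_cleavage_sites(rna: str, window: int = 17):
    """Scan an RNA (or cDNA) sequence for GUA, GUU and GUC triplets.

    Returns a list of (position, triplet, surrounding_region) tuples, where
    position is the 0-based index of the G and surrounding_region is a stretch
    of up to `window` nucleotides containing the triplet. Regions near the
    sequence ends may be shorter than `window`.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    sites = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in CLEAVAGE_TRIPLETS:
            start = max(0, i - (window - 3) // 2)
            end = min(len(rna), start + window)
            sites.append((i, triplet, rna[start:end]))
    return sites


# Hypothetical target fragment
for position, triplet, region in find_cleavage_sites("AAAAAAAAGUCAAAAAAAAA"):
    print(position, triplet, region)
```

Each returned region would then be passed to a structure prediction or accessibility test of the user's choice.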

[0170] Ribozymes can be introduced into cells as part of a DNA construct. Mechanical methods, such as
microinjection, liposome-mediated transfection, electroporation, or calcium phosphate precipitation, can be used to introduce
a ribozyme-containing DNA construct into cells in which it is desired to decrease "BREAST CANCER GENE"
expression. Alternatively, if it is desired that the cells stably retain the DNA construct, the construct can be supplied on a
plasmid and maintained as a separate element or integrated into the genome of the cells, as is known in the art. A
ribozyme-encoding DNA construct can include transcriptional regulatory elements, such as a promoter element, an
enhancer or UAS element, and a transcriptional terminator signal, for controlling transcription of ribozymes in the cells.
[0171] As taught in Haseloff et al., U.S. Pat. No. 5,641,673, ribozymes can be engineered so that ribozyme expression
will occur in response to factors which induce expression of a target gene. Ribozymes also can be engineered to
provide an additional level of regulation, so that destruction of mRNA occurs only when both a ribozyme and a target
gene are induced in the cells.

Polypeptides 

[0172] "BREAST CANCER GENE" polypeptides according to the invention comprise a polypeptide selected from
SEQ ID NO: 27 to 52 and 76 to 98 or encoded by any of the polynucleotide sequences of the SEQ ID NO: 1 to 26 and
53 to 75 or derivatives, fragments, analogues and homologues thereof. A "BREAST CANCER GENE" polypeptide of
the invention therefore can be a portion, a full-length, or a fusion protein comprising all or a portion of a "BREAST
CANCER GENE" polypeptide.

Protein Purification 

[0173] "BREAST CANCER GENE" polypeptides can be purified from any cell which expresses the polypeptide, including
host cells which have been transfected with "BREAST CANCER GENE" expression constructs. Breast tissue is an
especially useful source of "BREAST CANCER GENE" polypeptides. A purified "BREAST CANCER GENE"
polypeptide is separated from other compounds which normally associate with the "BREAST CANCER GENE" polypeptide in
the cell, such as certain proteins, carbohydrates, or lipids, using methods well-known in the art. Such methods include,
but are not limited to, size exclusion chromatography, ammonium sulfate fractionation, ion exchange chromatography,
affinity chromatography, and preparative gel electrophoresis. A preparation of purified "BREAST CANCER GENE"
polypeptides is at least 80% pure; preferably, the preparations are 90%, 95%, or 99% pure. Purity of the preparations
can be assessed by any means known in the art, such as SDS-polyacrylamide gel electrophoresis.

Obtaining Polypeptides 

[0174] "BREAST CANCER GENE" polypeptides can be obtained, for example, by purification from human cells, by 
expression of "BREAST CANCER GENE" polynucleotides, or by direct chemical synthesis. 

Biologically Active Variants 

[0175] "BREAST CANCER GENE" polypeptide variants which are biologically active, i.e., retain a "BREAST
CANCER GENE" activity, also are "BREAST CANCER GENE" polypeptides. Preferably, naturally or non-naturally occurring
"BREAST CANCER GENE" polypeptide variants have amino acid sequences which are at least about 60, 65, or 70,
preferably about 75, 80, 85, 90, 92, 94, 96, or 98% identical to any of the amino acid sequences of the polypeptides
of SEQ ID NO: 27 to 52 or 76 to 98 or the polypeptides encoded by any of the polynucleotides of SEQ ID NO: 1 to 26
or 53 to 75 or a fragment thereof. Percent identity between a putative "BREAST CANCER GENE" polypeptide variant
and one of the polypeptides of SEQ ID NO: 27 to 52 or 76 to 98 or the polypeptides encoded by any of the polynucleotides
of SEQ ID NO: 1 to 26 or 53 to 75 or a fragment thereof is determined by conventional methods [see, for example,
Altschul et al., 1986, (90) and Henikoff & Henikoff, 1992, (91)]. Briefly, two amino acid sequences are aligned to optimize
the alignment scores using a gap opening penalty of 10, a gap extension penalty of 1, and the "BLOSUM62" scoring
matrix of Henikoff & Henikoff, (91).
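
Once two sequences have been aligned, the percent identity itself is a simple count. The Python sketch below assumes the alignment has already been produced by a separate tool and is supplied as two gapped strings of equal length; the function name is hypothetical, and the choice to exclude gapped columns from the denominator is one of several conventions in use, adopted here only for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the aligned (non-gap) columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are not counted (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical alignment of a reference fragment and a putative variant
print(round(percent_identity("MKT-LLV", "MKTALIV"), 1))
```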

[0176] Those skilled in the art appreciate that there are many established algorithms available to align two amino
acid sequences. The "FASTA" similarity search algorithm of Pearson & Lipman is a suitable protein alignment method
for examining the level of identity shared by an amino acid sequence disclosed herein and the amino acid sequence
of a putative variant [Pearson & Lipman, 1988, (92), and Pearson, 1990, (93)]. Briefly, FASTA first characterizes
sequence similarity by identifying regions shared by the query sequence (e.g., SEQ ID NO: 1 to 26 or 53 to 75) and a
test sequence that have either the highest density of identities (if the ktup variable is 1) or pairs of identities (if ktup=2),
without considering conservative amino acid substitutions, insertions, or deletions. The ten regions with the highest
density of identities are then rescored by comparing the similarity of all paired amino acids using an amino acid
substitution matrix, and the ends of the regions are "trimmed" to include only those residues that contribute to the highest
score. If there are several regions with scores greater than the "cutoff" value (calculated by a predetermined formula
based upon the length of the sequence and the ktup value), then the trimmed initial regions are examined to determine
whether the regions can be joined to form an approximate alignment with gaps. Finally, the highest scoring regions of
the two amino acid sequences are aligned using a modification of the Needleman-Wunsch-Sellers algorithm
[Needleman & Wunsch, 1970, (94), and Sellers, 1974, (95)], which allows for amino acid insertions and deletions. Preferred
parameters for FASTA analysis are: ktup=1, gap opening penalty=10, gap extension penalty=1, and substitution
matrix=BLOSUM62. These parameters can be introduced into a FASTA program by modifying the scoring matrix file
("SMATRIX"), as explained in Appendix 2 of Pearson, (93).
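
The first stage of the procedure summarized above, finding shared ktup words and the diagonals on which they accumulate, can be sketched as follows. This is a simplified illustration and not the FASTA program itself: rescoring with a substitution matrix, trimming, joining of regions and the final gapped alignment are omitted, and the function names are hypothetical.

```python
from collections import Counter


def ktup_diagonal_hits(query: str, subject: str, ktup: int = 1) -> Counter:
    """Count identical words of length ktup on each diagonal (subject index minus query index)."""
    positions = {}
    for i in range(len(query) - ktup + 1):
        positions.setdefault(query[i:i + ktup], []).append(i)
    diagonals = Counter()
    for j in range(len(subject) - ktup + 1):
        for i in positions.get(subject[j:j + ktup], ()):
            diagonals[j - i] += 1
    return diagonals


def best_diagonals(query: str, subject: str, ktup: int = 1, top: int = 10):
    """Return up to `top` diagonals with the most word hits, as (diagonal, hits) pairs."""
    return ktup_diagonal_hits(query, subject, ktup).most_common(top)


# Hypothetical peptide fragments; the shared region lies on diagonal 2
print(best_diagonals("ACDEFG", "XXACDEFG", ktup=2))
```

A dictionary lookup of word positions keeps this stage roughly linear in sequence length, which is the reason the word-matching step is performed before any matrix-based scoring.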

[0177] FASTA can also be used to determine the sequence identity of nucleic acid molecules using a ratio as disclosed
above. For nucleotide sequence comparisons, the ktup value can range from one to six, preferably from three to
six, most preferably three, with other parameters set as default.

[0178] Variations in percent identity can be due, for example, to amino acid substitutions, insertions, or deletions.
Amino acid substitutions are defined as one for one amino acid replacements. They are conservative in nature when
the substituted amino acid has similar structural and/or chemical properties. Examples of conservative replacements
are substitution of a leucine with an isoleucine or valine, an aspartate with a glutamate, or a threonine with a serine.
[0179] Amino acid insertions or deletions are changes to or within an amino acid sequence. They typically fall in the
range of about 1 to 5 amino acids. Guidance in determining which amino acid residues can be substituted, inserted,
or deleted without abolishing biological or immunological activity of a "BREAST CANCER GENE" polypeptide can be
found using computer programs well known in the art, such as DNASTAR software. Whether an amino acid change
results in a biologically active "BREAST CANCER GENE" polypeptide can readily be determined by assaying for
"BREAST CANCER GENE" activity, as described for example, in the specific Examples, below. Larger insertions or
deletions can also be caused by alternative splicing. Protein domains can be inserted or deleted without altering the
main activity of the protein.

Fusion Proteins 

[0180] Fusion proteins are useful for generating antibodies against "BREAST CANCER GENE" polypeptide amino
acid sequences and for use in various assay systems. For example, fusion proteins can be used to identify proteins
which interact with portions of a "BREAST CANCER GENE" polypeptide. Protein affinity chromatography or
library-based assays for protein-protein interactions, such as the yeast two-hybrid or phage display systems, can be used for
this purpose. Such methods are well known in the art and also can be used as drug screens.
[0181] A "BREAST CANCER GENE" polypeptide fusion protein comprises two polypeptide segments fused together
by means of a peptide bond. The first polypeptide segment comprises at least 25, 50, 75, 100, 150, 200, 300, 400,
500, 600, 700 or 750 contiguous amino acids of an amino acid sequence encoded by any polynucleotide sequences
of the SEQ ID NO: 1 to 26 or 53 to 75 or of a biologically active variant, such as those described above. The first
polypeptide segment also can comprise full-length "BREAST CANCER GENE".

[0182] The second polypeptide segment can be a full-length protein or a protein fragment. Proteins commonly used
in fusion protein construction include β-galactosidase, β-glucuronidase, green fluorescent protein (GFP),
autofluorescent proteins, including blue fluorescent protein (BFP), glutathione-S-transferase (GST), luciferase, horseradish
peroxidase (HRP), and chloramphenicol acetyltransferase (CAT). Additionally, epitope tags are used in fusion protein
constructions, including histidine (His) tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags,
and thioredoxin (Trx) tags. Other fusion constructions can include maltose binding protein (MBP), S-tag, LexA DNA
binding domain (DBD) fusions, GAL4 DNA binding domain fusions, and herpes simplex virus (HSV) VP16 protein
fusions. A fusion protein also can be engineered to contain a cleavage site located between the "BREAST CANCER
GENE" polypeptide-encoding sequence and the heterologous protein sequence, so that the "BREAST CANCER
GENE" polypeptide can be cleaved and purified away from the heterologous moiety.

[0183] A fusion protein can be synthesized chemically, as is known in the art. Preferably, a fusion protein is produced
by covalently linking two polypeptide segments or by standard procedures in the art of molecular biology. Recombinant
DNA methods can be used to prepare fusion proteins, for example, by making a DNA construct which comprises coding
sequences selected from any of the polynucleotide sequences of the SEQ ID NO: 1 to 26 and 53 to 75 in proper reading
frame with nucleotides encoding the second polypeptide segment and expressing the DNA construct in a host cell, as
is known in the art. Many kits for constructing fusion proteins are available from companies such as Promega
Corporation (Madison, WI), Stratagene (La Jolla, CA), CLONTECH (Mountain View, CA), Santa Cruz Biotechnology (Santa
Cruz, CA), MBL International Corporation (MIC; Watertown, MA), and Quantum Biotechnologies (Montreal, Canada;
1-888-DNA-KITS).
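
The requirement that the two coding sequences be joined "in proper reading frame" can be checked computationally before a construct is built. The following Python sketch is a simplified illustration under stated assumptions: both inputs are complete coding sequences starting at codon position 1, no linker or cleavage site is inserted, and the function name and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_fusion(first_cds: str, second_cds: str) -> str:
    """Join two coding sequences so that the second is read in frame with the first.

    A terminal stop codon of the first segment is removed so that translation
    continues into the second segment. Raises ValueError if either segment is
    not a whole number of codons or if the first segment has an internal stop.
    """
    first, second = first_cds.upper(), second_cds.upper()
    if len(first) % 3 or len(second) % 3:
        raise ValueError("each coding sequence must be a multiple of 3 nucleotides")
    codons = [first[i:i + 3] for i in range(0, len(first), 3)]
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]  # drop the terminal stop of the first segment
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("first segment contains an internal stop codon")
    return "".join(codons) + second


# Hypothetical short coding sequences
print(in_frame_fusion("ATGGCCTAA", "ATGAAATGA"))
```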

Identification of Species Homologues 

[0184] Species homologues of a human "BREAST CANCER GENE" polypeptide can be obtained using "BREAST
CANCER GENE" polypeptide polynucleotides (described below) to make suitable probes or primers for screening
cDNA expression libraries from other species, such as mice, monkeys, or yeast, identifying cDNAs which encode
homologues of a "BREAST CANCER GENE" polypeptide, and expressing the cDNAs as is known in the art.

Expression of Polynucleotides 

[0185] To express a "BREAST CANCER GENE" polynucleotide, the polynucleotide can be inserted into an
expression vector which contains the necessary elements for the transcription and translation of the inserted coding sequence.
Methods which are well known to those skilled in the art can be used to construct expression vectors containing
sequences encoding "BREAST CANCER GENE" polypeptides and appropriate transcriptional and translational control
elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic
recombination. Such techniques are described, for example, in Sambrook et al., (77) and in Ausubel et al., (78).

[0186] A variety of expression vector/host systems can be utilized to contain and express sequences encoding a
"BREAST CANCER GENE" polypeptide. These include, but are not limited to, microorganisms, such as bacteria
transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast
expression vectors; insect cell systems infected with virus expression vectors (e.g., baculovirus); plant cell systems
transformed with virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or with
bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems.

[0187] The control elements or regulatory sequences are those regions of the vector (enhancers, promoters, 5' and
3' untranslated regions) which interact with host cellular proteins to carry out transcription and translation. Such elements
can vary in their strength and specificity. Depending on the vector system and host utilized, any number of suitable
transcription and translation elements, including constitutive and inducible promoters, can be used. For example, when
cloning in bacterial systems, inducible promoters such as the hybrid lacZ promoter of the BLUESCRIPT phagemid
(Stratagene, La Jolla, Calif.) or pSPORT1 plasmid (Life Technologies) and the like can be used. The baculovirus
polyhedrin promoter can be used in insect cells. Promoters or enhancers derived from the genomes of plant cells (e.g.,
heat shock, RUBISCO, and storage protein genes) or from plant viruses (e.g., viral promoters or leader sequences)
can be cloned into the vector. In mammalian cell systems, promoters from mammalian genes or from mammalian
viruses are preferable. If it is necessary to generate a cell line that contains multiple copies of a nucleotide sequence
encoding a "BREAST CANCER GENE" polypeptide, vectors based on SV40 or EBV can be used with an appropriate
selectable marker.

Bacterial and Yeast Expression Systems

[0188] In bacterial systems, a number of expression vectors can be selected depending upon the use intended for
the "BREAST CANCER GENE" polypeptide. For example, when a large quantity of the "BREAST CANCER GENE"
polypeptide is needed for the induction of antibodies, vectors which direct high level expression of fusion proteins that
are readily purified can be used. Such vectors include, but are not limited to, multifunctional E. coli cloning and
expression vectors such as BLUESCRIPT (Stratagene). In a BLUESCRIPT vector, a sequence encoding the "BREAST
CANCER GENE" polypeptide can be ligated into the vector in frame with sequences for the amino terminal Met and the
subsequent 7 residues of β-galactosidase so that a hybrid protein is produced. pIN vectors [Van Heeke & Schuster,
(17)] or pGEX vectors (Promega, Madison, Wis.) also can be used to express foreign polypeptides as fusion proteins
with glutathione S-transferase (GST).

[0189] In general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption to
glutathione agarose beads followed by elution in the presence of free glutathione. Proteins made in such systems can be
designed to include heparin, thrombin, or factor Xa protease cleavage sites so that the cloned polypeptide of interest
can be released from the GST moiety at will.

[0190] In the yeast Saccharomyces cerevisiae, a number of vectors containing constitutive or inducible promoters 
such as alpha factor, alcohol oxidase, and PGH can be used. For reviews, see Ausubel et al., (4) and Grant et al., (18).

Plant and Insect Expression Systems

[0191] If plant expression vectors are used, the expression of sequences encoding "BREAST CANCER GENE" 
polypeptides can be driven by any of a number of promoters. For example, viral promoters such as the 35S and 19S 
promoters of CaMV can be used alone or in combination with the omega leader sequence from TMV [Takamatsu,
1987, (96)]. Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock promoters can be used
[Coruzzi et al., 1984, (97); Broglie et al., 1984, (98); Winter et al., 1991, (99)]. These constructs can be introduced into
plant cells by direct DNA transformation or by pathogen-mediated transfection. Such techniques are described in a 
number of generally available reviews. 

[0192] An insect system also can be used to express a "BREAST CANCER GENE" polypeptide. For example, in
one such system Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign
genes in Spodoptera frugiperda cells or in Trichoplusia larvae. Sequences encoding "BREAST CANCER GENE"
polypeptides can be cloned into a nonessential region of the virus, such as the polyhedrin gene, and placed under
control of the polyhedrin promoter. Successful insertion of sequences encoding "BREAST CANCER GENE" polypeptides
will render the polyhedrin gene inactive and produce recombinant virus lacking coat protein. The recombinant viruses
can then be used to infect S. frugiperda cells or Trichoplusia larvae in which "BREAST CANCER GENE" polypeptides
can be expressed [Engelhard et al., 1994, (100)].

Mammalian Expression Systems 

[0193] A number of viral-based expression systems can be used to express "BREAST CANCER GENE" polypeptides
in mammalian host cells. For example, if an adenovirus is used as an expression vector, sequences encoding "BREAST
CANCER GENE" polypeptides can be ligated into an adenovirus transcription/translation complex comprising the late
promoter and tripartite leader sequence. Insertion in a nonessential E1 or E3 region of the viral genome can be used
to obtain a viable virus which is capable of expressing a "BREAST CANCER GENE" polypeptide in infected host cells
[Logan & Shenk, 1984, (101)]. If desired, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer,
can be used to increase expression in mammalian host cells.

[0194] Human artificial chromosomes (HACs) also can be used to deliver larger fragments of DNA than can be 
contained and expressed in a plasmid. HACs of 6M to 10M are constructed and delivered to cells via conventional 
delivery methods (e.g., liposomes, polycationic amino polymers, or vesicles). 

[0195] Specific initiation signals also can be used to achieve more efficient translation of sequences encoding
"BREAST CANCER GENE" polypeptides. Such signals include the ATG initiation codon and adjacent sequences. In
cases where sequences encoding a "BREAST CANCER GENE" polypeptide, its initiation codon, and upstream
sequences are inserted into the appropriate expression vector, no additional transcriptional or translational control
signals may be needed. However, in cases where only coding sequence, or a fragment thereof, is inserted, exogenous
translational control signals (including the ATG initiation codon) should be provided. The initiation codon should be in
the correct reading frame to ensure translation of the entire insert. Exogenous translational elements and initiation
codons can be of various origins, both natural and synthetic. The efficiency of expression can be enhanced by the
inclusion of enhancers which are appropriate for the particular cell system which is used [Scharf et al., 1994, (102)].
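
The reading-frame requirement described in [0195] can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name, its parameters and the example construct are hypothetical and do not come from the specification. The check confirms that an ATG is present at the stated position, that the insert begins on a codon boundary relative to that ATG, and that no in-frame stop codon occurs before the end of the insert.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def insert_is_in_frame(construct, atg_index, insert_start, insert_end):
    """Return True if translation starting at the ATG at `atg_index` reads the
    whole insert construct[insert_start:insert_end] in frame.

    All names are hypothetical. `insert_end` should exclude the insert's own
    terminal stop codon, if it has one.
    """
    seq = construct.upper()
    # An initiation codon must actually be present at the stated position.
    if seq[atg_index:atg_index + 3] != "ATG":
        return False
    # The insert must start on a codon boundary relative to the ATG.
    if (insert_start - atg_index) % 3 != 0:
        return False
    # Walk codon by codon from the ATG to the end of the insert.
    for pos in range(atg_index, insert_end - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return False
    return True


if __name__ == "__main__":
    leader = "CCATGGGA"          # hypothetical vector sequence carrying the ATG
    insert = "GCTAGCAAA"         # hypothetical coding fragment
    construct = leader + insert + "TAA"
    print(insert_is_in_frame(construct, 2, len(leader), len(leader) + len(insert)))
```

A frame shift of a single nucleotide between the ATG and the insert makes the check fail, which mirrors the statement that the initiation codon must be in the correct reading frame.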

Host Cells

[0196] A host cell strain can be chosen for its ability to modulate the expression of the inserted sequences or to
process the expressed "BREAST CANCER GENE" polypeptide in the desired fashion. Such modifications of the
polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and
acylation. Post-translational processing which cleaves a "prepro" form of the polypeptide also can be used to facilitate
correct insertion, folding and/or function. Different host cells which have specific cellular machinery and characteristic
mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the
American Type Culture Collection (ATCC; 10801 University Boulevard, Manassas, VA 20110-2209) and can be chosen
to ensure the correct modification and processing of the foreign protein.

[0197] Stable expression is preferred for long-term, high-yield production of recombinant proteins. For example, cell
lines which stably express "BREAST CANCER GENE" polypeptides can be transformed using expression vectors
which can contain viral origins of replication and/or endogenous expression elements and a selectable marker gene
on the same or on a separate vector. Following the introduction of the vector, cells can be allowed to grow for 1-2 days
in an enriched medium before they are switched to a selective medium. The purpose of the selectable marker is to
confer resistance to selection, and its presence allows growth and recovery of cells which successfully express the
introduced "BREAST CANCER GENE" sequences. Resistant clones of stably transformed cells can be proliferated
using tissue culture techniques appropriate to the cell type [Freshney et al., 1986, (103)].

[0198] Any number of selection systems can be used to recover transformed cell lines. These include, but are not
limited to, the herpes simplex virus thymidine kinase [Wigler et al., 1977, (104)] and adenine phosphoribosyltransferase
[Lowy et al., 1980, (105)] genes which can be employed in tk⁻ or aprt⁻ cells, respectively. Also, antimetabolite,
antibiotic, or herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to
methotrexate [Wigler et al., 1980, (106)], npt confers resistance to the aminoglycosides neomycin and G418
[Colbere-Garapin et al., 1981, (107)], and als and pat confer resistance to chlorsulfuron and phosphinothricin
acetyltransferase, respectively. Additional selectable genes have been described. For example, trpB allows cells to
utilize indole in place of tryptophan, and hisD allows cells to utilize histidinol in place of histidine [Hartman & Mulligan,
1988, (108)]. Visible markers such as anthocyanins, β-glucuronidase and its substrate GUS, and luciferase and its
substrate luciferin, can be used to identify transformants and to quantify the amount of transient or stable protein
expression attributable to a specific vector system [Rhodes et al., 1995, (109)].

Detecting Expression and gene product 

[0199] Although the presence of marker gene expression suggests that the "BREAST CANCER GENE" polynucleotide
is also present, its presence and expression may need to be confirmed. For example, if a sequence encoding a
"BREAST CANCER GENE" polypeptide is inserted within a marker gene sequence, transformed cells containing
sequences which encode a "BREAST CANCER GENE" polypeptide can be identified by the absence of marker gene
function. Alternatively, a marker gene can be placed in tandem with a sequence encoding a "BREAST CANCER GENE"
polypeptide under the control of a single promoter. Expression of the marker gene in response to induction or selection
usually indicates expression of the "BREAST CANCER GENE" polynucleotide.

[0200] Alternatively, host cells which contain a "BREAST CANCER GENE" polynucleotide and which express a
"BREAST CANCER GENE" polypeptide can be identified by a variety of procedures known to those of skill in the art.
These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridization and protein bioassay or
immunoassay techniques which include membrane, solution, or chip-based technologies for the detection and/or
quantification of polynucleotide or protein. For example, the presence of a polynucleotide sequence encoding a "BREAST
CANCER GENE" polypeptide can be detected by DNA-DNA or DNA-RNA hybridization or amplification using probes
or fragments of polynucleotides encoding a "BREAST CANCER GENE" polypeptide. Nucleic acid
amplification-based assays involve the use of oligonucleotides selected from sequences encoding a "BREAST CANCER
GENE" polypeptide to detect transformants which contain a "BREAST CANCER GENE" polynucleotide.

[0201] A variety of protocols for detecting and measuring the expression of a "BREAST CANCER GENE" polypeptide,
using either polyclonal or monoclonal antibodies specific for the polypeptide, are known in the art. Examples include
enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), and fluorescence activated cell sorting
(FACS). A two-site, monoclonal-based immunoassay using monoclonal antibodies reactive to two non-interfering
epitopes on a "BREAST CANCER GENE" polypeptide can be used, or a competitive binding assay can be employed.
These and other assays are described in Hampton et al., (110) and Maddox et al., (111).

[0202] A wide variety of labels and conjugation techniques are known by those skilled in the art and can be used in
various nucleic acid and amino acid assays. Means for producing labeled hybridization or PCR probes for detecting
sequences related to polynucleotides encoding "BREAST CANCER GENE" polypeptides include oligo labeling, nick
translation, end-labeling, or PCR amplification using a labeled nucleotide. Alternatively, sequences encoding a
"BREAST CANCER GENE" polypeptide can be cloned into a vector for the production of an mRNA probe. Such vectors
are known in the art, are commercially available, and can be used to synthesize RNA probes in vitro by addition of
labeled nucleotides and an appropriate RNA polymerase such as T7, T3, or SP6. These procedures can be conducted
using a variety of commercially available kits (Amersham Pharmacia Biotech, Promega, and US Biochemical). Suitable
reporter molecules or labels which can be used for ease of detection include radionuclides, enzymes, and fluorescent,
chemiluminescent, or chromogenic agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

Expression and Purification of Polypeptides

[0203] Host cells transformed with nucleotide sequences encoding a "BREAST CANCER GENE" polypeptide can
be cultured under conditions suitable for the expression and recovery of the protein from cell culture. The polypeptide
produced by a transformed cell can be secreted or stored intracellularly depending on the sequence and/or the vector
used. As will be understood by those of skill in the art, expression vectors containing polynucleotides which encode
"BREAST CANCER GENE" polypeptides can be designed to contain signal sequences which direct secretion of soluble
"BREAST CANCER GENE" polypeptides through a prokaryotic or eukaryotic cell membrane or which direct the
membrane insertion of membrane-bound "BREAST CANCER GENE" polypeptide.

[0204] As discussed above, other constructions can be used to join a sequence encoding a "BREAST CANCER
GENE" polypeptide to a nucleotide sequence encoding a polypeptide domain which will facilitate purification of soluble
proteins. Such purification facilitating domains include, but are not limited to, metal chelating peptides such as
histidine-tryptophan modules that allow purification on immobilized metals, protein A domains that allow purification on
immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity purification system (Immunex Corp.,
Seattle, Wash.). Inclusion of cleavable linker sequences such as those specific for Factor Xa or enterokinase
(Invitrogen, San Diego, CA) between the purification domain and the "BREAST CANCER GENE" polypeptide also can be
used to facilitate purification. One such expression vector provides for expression of a fusion protein containing a
"BREAST CANCER GENE" polypeptide and 6 histidine residues preceding a thioredoxin or an enterokinase cleavage
site. The histidine residues facilitate purification by IMAC (immobilized metal ion affinity chromatography) [Porath et al.,
1992, (112)], while the enterokinase cleavage site provides a means for purifying the "BREAST CANCER GENE"
polypeptide from the fusion protein. Vectors which contain fusion proteins are disclosed in Kroll et al., (113).

Chemical Synthesis 

[0205] Sequences encoding a "BREAST CANCER GENE" polypeptide can be synthesized, in whole or in part, using
chemical methods well known in the art [see Caruthers et al., (114) and Horn et al., (115)]. Alternatively, a "BREAST
CANCER GENE" polypeptide itself can be produced using chemical methods to synthesize its amino acid sequence,
such as by direct peptide synthesis using solid-phase techniques [Merrifield, 1963, (116) and Roberge et al., 1995,
(117)]. Protein synthesis can be performed using manual techniques or by automation. Automated synthesis can be
achieved, for example, using an Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer). Optionally, fragments of
"BREAST CANCER GENE" polypeptides can be separately synthesized and combined using chemical methods to
produce a full-length molecule.

[0206] The newly synthesized peptide can be substantially purified by preparative high performance liquid
chromatography [Creighton, 1983, (118)]. The composition of a synthetic "BREAST CANCER GENE" polypeptide can be
confirmed by amino acid analysis or sequencing [e.g., the Edman degradation procedure; see Creighton, (118)].
Additionally, any portion of the amino acid sequence of the "BREAST CANCER GENE" polypeptide can be altered during
direct synthesis and/or combined using chemical methods with sequences from other proteins to produce a variant
polypeptide or a fusion protein.

Production of Altered Polypeptides

[0207] As will be understood by those of skill in the art, it may be advantageous to produce "BREAST CANCER
GENE" polypeptide-encoding nucleotide sequences possessing non-naturally occurring codons. For example, codons
preferred by a particular prokaryotic or eukaryotic host can be selected to increase the rate of protein expression or to
produce an RNA transcript having desirable properties, such as a half-life which is longer than that of a transcript
generated from the naturally occurring sequence.
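
As an illustration of the codon selection described in [0207], the following Python sketch recodes a coding sequence with host-preferred synonymous codons while leaving the encoded amino acid sequence unchanged. The preference table `PREFERRED` is an illustrative placeholder, not measured codon-usage data for any host, and the function names are hypothetical; the genetic code table is the standard one.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical, partial table of host-preferred codons (placeholder values only).
PREFERRED = {"L": "CTG", "K": "AAA", "R": "CGT", "A": "GCG"}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna, preferred=PREFERRED):
    """Swap each codon for the preferred synonymous codon, when one is listed."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join(preferred.get(CODON_TO_AA[c], c) for c in codons)


if __name__ == "__main__":
    original = "ATGTTAAGAGCTAAGTAA"   # hypothetical fragment
    optimized = recode(original)
    print(optimized, translate(original) == translate(optimized))
```

Codons for amino acids absent from the table, including the start and stop codons, are left as they are, so the protein sequence is preserved by construction.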

[0208] The nucleotide sequences disclosed herein can be engineered using methods generally known in the art to
alter "BREAST CANCER GENE" polypeptide-encoding sequences for a variety of reasons, including but not limited
to, alterations which modify the cloning, processing, and/or expression of the polypeptide or mRNA product. DNA
shuffling by random fragmentation and PCR re-assembly of gene fragments and synthetic oligonucleotides can be
used to engineer the nucleotide sequences. For example, site-directed mutagenesis can be used to insert new
restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, introduce mutations,
and so forth.

Predictive, Diagnostic and Prognostic Assays

[0209] The present invention provides a method for determining whether a subject is at risk for developing malignant
neoplasia and breast cancer in particular by detecting one of the disclosed polynucleotide markers comprising any of
the polynucleotide sequences of SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19 or 21 to 26 or 53 to 75 and/or the
polypeptide markers encoded thereby or polypeptide markers comprising any of the polypeptide sequences of
SEQ ID NO: 28 to 32, 34, 35, 37 to 42, 44, 45 or 47 to 52 or 76 to 98 or at least 2 of the disclosed polynucleotides
selected from SEQ ID NO: 1 to 26 and 53 to 75 or at least 2 of the disclosed polypeptides selected from SEQ ID
NO: 28 to 32 and 76 to 98 for malignant neoplasia and breast cancer in particular.

[0210] In clinical applications, biological samples can be screened for the presence and/or absence of the biomarkers
identified herein. Such samples are, for example, needle biopsy cores, surgical resection samples, or body fluids like
serum, thin needle nipple aspirates and urine. For example, these methods include obtaining a biopsy, which is
optionally fractionated by cryostat sectioning to enrich diseased cells to about 80% of the total cell population. In certain
embodiments, polynucleotides extracted from these samples may be amplified using techniques well known in the art.
The expression levels of selected markers detected would be compared with statistically valid groups of diseased and
healthy samples.

[0211] In one embodiment the diagnostic method comprises determining whether a subject has an abnormal mRNA
and/or protein level of the disclosed markers, such as by Northern blot analysis, reverse transcription-polymerase chain
reaction (RT-PCR), in situ hybridization, immunoprecipitation, Western blot hybridization, or immunohistochemistry.
According to the method, cells are obtained from a subject and the levels of the disclosed biomarkers, at the protein or
mRNA level, are determined and compared to the level of these markers in a healthy subject. An abnormal level of the
biomarker polypeptide or mRNA is likely to be indicative of malignant neoplasia such as breast cancer.
[0212] In another embodiment the diagnostic method comprises determining whether a subject has an abnormal
DNA content of said genes or said genomic loci, such as by Southern blot analysis, dot blot analysis, fluorescence or
colorimetric in situ hybridization, comparative genomic hybridization, genotyping by VNTR, STS-PCR or quantitative
PCR. In general these assays comprise the use of probes from representative genomic regions. The probes contain
at least parts of said genomic regions or sequences complementary or analogous to said regions, in particular intra-
or intergenic regions of said genes or genomic regions. The probes can consist of nucleotide sequences or sequences
of analogous function (e.g. PNAs, Morpholino oligomers) that are able to bind to target regions by hybridization. In
general, genomic regions that are altered in said patient samples are compared with unaffected control samples (normal
tissue from the same or different patients, surrounding unaffected tissue, peripheral blood) or with genomic regions of
the same sample that do not have said alterations and can therefore serve as internal controls. In a preferred embodiment
regions located on the same chromosome are used. Alternatively, gonosomal regions and/or regions with a defined
varying amount in the sample are used. In one favored embodiment the DNA content, structure, composition or
modification of distinct genomic regions is compared. Especially favored are methods that detect the DNA
content of said samples, where the amount of target regions is altered by amplifications and/or deletions. In another
embodiment the target regions are analyzed for the presence of polymorphisms (e.g. Single Nucleotide Polymorphisms
or mutations) that affect or predispose the cells in said samples with regard to clinical aspects, being of diagnostic,
prognostic or therapeutic value. Preferably, the identification of sequence variations is used to define haplotypes that
result in characteristic behavior of said samples with said clinical aspects.
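
The comparison of a target region with an internal control region, as outlined in [0212], can be sketched as a simple ratio calculation in Python. This is a minimal sketch under stated assumptions: the signal values, the function names and the cut-off values of 1.5 and 0.67 are hypothetical and are not taken from the specification.

```python
def copy_number_ratio(target_patient, control_patient, target_reference, control_reference):
    """Relative DNA content of a target region.

    The target signal is first divided by the signal of an internal control
    region from the same sample, and the result is then related to the same
    quotient measured in an unaffected reference sample.
    """
    return (target_patient / control_patient) / (target_reference / control_reference)


def classify(ratio, gain_cutoff=1.5, loss_cutoff=0.67):
    """Label a ratio; the cut-offs are illustrative assumptions only."""
    if ratio >= gain_cutoff:
        return "amplified"
    if ratio <= loss_cutoff:
        return "deleted"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical signal intensities (arbitrary units).
    ratio = copy_number_ratio(800.0, 200.0, 210.0, 200.0)
    print(round(ratio, 2), classify(ratio))
```

Using a control region on the same chromosome, as the preferred embodiment suggests, would help distinguish a regional amplification from a gain of the whole chromosome.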

[0213] The following examples of genes in 17q12-21.2 are offered by way of illustration, not by way of limitation.
[0214] One embodiment of the invention is a method for the prediction, diagnosis or prognosis of malignant neoplasia
by the detection of at least 10, at least 5, or at least 4, or at least 3 and more preferably at least 2 markers whereby the
markers are genes and fragments thereof and/or genomic nucleic acid sequences that are located on one chromosomal
region which is altered in malignant neoplasia.

[0215] One further embodiment of the invention is a method for the prediction, diagnosis or prognosis of malignant
neoplasia by the detection of at least 10, at least 5, or at least 4, or at least 3 and more preferably at least 2 markers
whereby the markers (a) are genes and fragments thereof and/or genomic nucleic acid sequences that are located on
one or more chromosomal region(s) which is/are altered in malignant neoplasia and (b) functionally interact as (i)
receptor and ligand or (ii) members of the same signal transduction pathway or (iii) members of synergistic signal
transduction pathways or (iv) members of antagonistic signal transduction pathways or (v) transcription factor and
transcription factor binding site.

[0216] In one embodiment, the method for the prediction, diagnosis or prognosis of malignant neoplasia and breast
cancer in particular is done by the detection of:

(a) a polynucleotide selected from the polynucleotides of the SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19, 21 to 26 or
53 to 75;

(b) a polynucleotide which hybridizes under stringent conditions to a polynucleotide specified in (a) encoding a
polypeptide exhibiting the same biological function as specified for the respective sequence in Table 2 or 3;

(c) a polynucleotide the sequence of which deviates from the polynucleotide specified in (a) and (b) due to the
degeneracy of the genetic code encoding a polypeptide exhibiting the same biological function as specified for the
respective sequence in Table 2 or 3;

(d) a polynucleotide which represents a specific fragment, derivative or allelic variation of a polynucleotide
sequence specified in (a) to (c);

in a biological sample comprising the following steps: hybridizing any polynucleotide or analogous oligomer specified
in (a) to (d) to a polynucleotide material of a biological sample, thereby forming a hybridization complex; and detecting
said hybridization complex.

[0217] In another embodiment the method for the prediction, diagnosis or prognosis of malignant neoplasia is done
as just described, except that, before hybridization, the polynucleotide material of the biological sample is amplified.
[0218] In another embodiment the method for the diagnosis or prognosis of malignant neoplasia and breast cancer
in particular is done by the detection of:

(a) a polynucleotide selected from the polynucleotides of the SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19, 21 to 26
or 53 to 75;

(b) a polynucleotide which hybridizes under stringent conditions to a polynucleotide specified in (a) encoding a
polypeptide exhibiting the same biological function as specified for the respective sequence in Table 2 or 3;

(c) a polynucleotide the sequence of which deviates from the polynucleotide specified in (a) and (b) due to the
degeneracy of the genetic code encoding a polypeptide exhibiting the same biological function as specified for the
respective sequence in Table 2 or 3;

(d) a polynucleotide which represents a specific fragment, derivative or allelic variation of a polynucleotide
sequence specified in (a) to (c);

(e) a polypeptide encoded by a polynucleotide sequence specified in (a) to (d);

(f) a polypeptide comprising any polypeptide of SEQ ID NO: 28 to 32, 34, 35, 37 to 42, 44, 45, 47 to 52 or 76 to 98;

comprising the steps of contacting a biological sample with a reagent which specifically interacts with the polynucleotide
specified in (a) to (d) or the polypeptide specified in (e).

DNA array technology

[0219] In one embodiment, the present invention also provides a method wherein polynucleotide probes are
immobilized on a DNA chip in an organized array. Oligonucleotides can be bound to a solid support by a variety of
processes, including lithography. For example a chip can hold up to 410,000 oligonucleotides (GeneChip, Affymetrix).
The present invention provides significant advantages over the available tests for malignant neoplasia, such as breast
cancer, because it increases the reliability of the test by providing an array of polynucleotide markers on a single chip.
[0220] The method includes obtaining a biopsy of an affected person, which is optionally fractionated by cryostat
sectioning to enrich diseased cells to about 80% of the total cell population, and the use of body fluids such as serum
or urine, or cell containing liquids (e.g. derived from fine needle aspirates). The DNA or RNA is then extracted,
amplified, and analyzed with a DNA chip to determine the presence or absence of the marker polynucleotide sequences.
In one embodiment, the polynucleotide probes are spotted onto a substrate in a two-dimensional matrix or array.
Samples of polynucleotides can be labeled and then hybridized to the probes. Double-stranded polynucleotides,
comprising the labeled sample polynucleotides bound to probe polynucleotides, can be detected once the unbound
portion of the sample is washed away.

[0221] The probe polynucleotides can be spotted on substrates including glass, nitrocellulose, etc. The probes can
be bound to the substrate by either covalent bonds or by non-specific interactions, such as hydrophobic interactions.
The sample polynucleotides can be labeled using radioactive labels, fluorophores, chromophores, etc. Techniques for
constructing arrays and methods of using these arrays are described in EP 0 799 897; WO 97/29212; WO 97/27317;
EP 0 785 280; WO 97/02357; U.S. Pat. No. 5,593,839; U.S. Pat. No. 5,578,832; EP 0 728 520; U.S. Pat. No. 5,599,695;
EP 0 721 016; U.S. Pat. No. 5,556,752; WO 95/22058; and U.S. Pat. No. 5,631,734.

[0222] Further, arrays can be used to examine differential expression of genes and can be used to determine gene
function. For example, arrays of the instant polynucleotide sequences can be used to determine if any of the
polynucleotide sequences are differentially expressed between normal cells and diseased cells. High expression
of a particular message in a diseased sample, which is not observed in a corresponding normal sample, can indicate
a breast cancer specific protein.

[0223] Accordingly, in one aspect, the invention provides probes and primers that are specific to the unique
polynucleotide markers disclosed herein.
[0224] In one embodiment, the method comprises using a polynucleotide probe to determine the presence of
malignant cells, and breast cancer cells in particular, in a tissue from a patient. Specifically, the method comprises:

1) providing a polynucleotide probe comprising a nucleotide sequence at least 12 nucleotides in length, preferably
at least 15 nucleotides, more preferably 25 nucleotides, and most preferably at least 40 nucleotides, and up to all
or nearly all of the coding sequence, which is complementary to a portion of the coding sequence of a polynucleotide
selected from the polynucleotides of SEQ ID NO: 1 to 26 and 53 to 75 or a sequence complementary thereto and is
differentially expressed in malignant neoplasia, such as breast cancer;

2) obtaining a first tissue sample from a patient with malignant neoplasia;

3) providing a second tissue sample from a patient with no malignant neoplasia;

4) contacting the polynucleotide probe under stringent conditions with RNA of each of said first and second tissue
samples (e.g., in a Northern blot or in situ hybridization assay); and

5) comparing (a) the amount of hybridization of the probe with RNA of the first tissue sample, with (b) the amount
of hybridization of the probe with RNA of the second tissue sample;

wherein a statistically significant difference in the amount of hybridization with the RNA of the first tissue sample as
compared to the amount of hybridization with the RNA of the second tissue sample is indicative of malignant neoplasia
and breast cancer in particular in the first tissue sample.
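
The specification does not name a statistical test for the comparison of hybridization amounts. As one possible illustration only, the following Python sketch applies an exact two-sample permutation test to replicate hybridization signals from the first and second tissue samples. The choice of test, the function name and the example values are all assumptions made for this sketch.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(first, second):
    """Exact two-sided permutation test on the difference of group means.

    Every way of splitting the pooled measurements into groups of the original
    sizes is enumerated, so this is only practical for small replicate numbers.
    """
    pooled = list(first) + list(second)
    n_first = len(first)
    n_second = len(pooled) - n_first
    total = sum(pooled)
    observed = abs(mean(first) - mean(second))
    extreme = 0
    splits = 0
    for idx in combinations(range(len(pooled)), n_first):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_first - (total - group_sum) / n_second)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            extreme += 1
        splits += 1
    return extreme / splits


if __name__ == "__main__":
    # Hypothetical replicate hybridization signals (arbitrary units).
    first_sample = [9.1, 8.7, 9.4, 9.0]
    second_sample = [3.2, 2.9, 3.5, 3.1]
    print(permutation_p_value(first_sample, second_sample))
```

With four replicates per sample there are 70 possible splits, so the smallest attainable two-sided p-value is 2/70; more replicates would be needed to reach stricter significance levels.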

Data analysis methods 


[0225] Comparison of the expression levels of one or more "BREAST CANCER GENES" with reference expression
levels, e.g., expression levels in diseased cells of breast cancer or in normal counterpart cells, is preferably conducted
using computer systems. In one embodiment, expression levels are obtained in two cells and these two sets of
expression levels are introduced into a computer system for comparison. In a preferred embodiment, one set of expression
levels is entered into a computer system for comparison with values that are already present in the computer system,
or in computer-readable form that is then entered into the computer system.

[0226] In one embodiment, the invention provides a computer readable form of the gene expression profile data of 
the invention, or of values corresponding to the level of expression of at least one "BREAST CANCER GENE" in a 
diseased cell. The values can be mRNA expression levels obtained from experiments, e.g., microarray analysis. The 
values can also be mRNA levels normalised relative to a reference gene whose expression is constant in numerous 
cells under numerous conditions, e.g., GAPDH. In other embodiments, the values in the computer are ratios of, or 
differences between, normalized or non-normalized mRNA levels in different samples. 
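
The normalisation and ratio values described above can be sketched as follows. This is a minimal illustration only: the gene names other than GAPDH, the intensity values, and the choice of a log2 ratio are assumptions made for the example and are not taken from the specification.

```python
import math


def normalize_to_reference(levels, reference_gene="GAPDH"):
    """Divide each raw mRNA level by the reference gene level of the same sample."""
    ref = levels[reference_gene]
    if ref <= 0:
        raise ValueError("reference gene level must be positive")
    return {g: v / ref for g, v in levels.items() if g != reference_gene}


def log2_ratios(sample_a, sample_b):
    """Log2 ratio (a over b) for every gene measured in both samples.

    Levels must be positive; a zero level would need separate handling.
    """
    shared = sample_a.keys() & sample_b.keys()
    return {g: math.log2(sample_a[g] / sample_b[g]) for g in sorted(shared)}


# Hypothetical raw intensities (arbitrary units), e.g. from a microarray.
tumor = {"GAPDH": 1000.0, "GENE_A": 4000.0, "GENE_B": 250.0}
normal = {"GAPDH": 500.0, "GENE_A": 500.0, "GENE_B": 250.0}

tumor_norm = normalize_to_reference(tumor)    # GENE_A 4.0, GENE_B 0.25
normal_norm = normalize_to_reference(normal)  # GENE_A 1.0, GENE_B 0.5
ratios = log2_ratios(tumor_norm, normal_norm)
print(ratios)  # {'GENE_A': 2.0, 'GENE_B': -1.0}
```

Note that GENE_B has the same raw intensity in both samples, yet its normalised level is halved in the tumor sample; this is the effect the reference gene is meant to expose.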

[0227] The gene expression profile data can be in the form of a table, such as an Excel table. The data can be alone, 
or it can be part of a larger database, e.g., comprising other expression profiles. For example, the expression profile 
data of the invention can be part of a public database. The computer readable form can be in a computer. In another 
embodiment, the invention provides a computer displaying the gene expression profile data. 
[0228] In one embodiment, the invention provides a method for determining the similarity between the level of expression 
of one or more "BREAST CANCER GENES" in a first cell, e.g., a cell of a subject, and that in a second cell, 
comprising obtaining the level of expression of one or more "BREAST CANCER GENES" in a first cell and entering 
these values into a computer comprising a database including records comprising values corresponding to levels of 
expression of one or more "BREAST CANCER GENES" in a second cell, and processor instructions, e.g., a user 
interface, capable of receiving a selection of one or more values for comparison purposes with data that is stored in 
the computer. The computer may further comprise a means for converting the comparison data into a diagram or chart 
or other type of output. 

[0229] In another embodiment, values representing expression levels of "BREAST CANCER GENES" are entered 
into a computer system, comprising one or more databases with reference expression levels obtained from more than 
one cell. For example, the computer comprises expression data of diseased and normal cells. Instructions are provided 
to the computer, and the computer is capable of comparing the data entered with the data in the computer to determine 
whether the data entered is more similar to that of a normal cell or of a diseased cell. 

[0230] In another embodiment, the computer comprises values of expression levels in cells of subjects at different 
stages of breast cancer, and the computer is capable of comparing expression data entered into the computer with 
the data stored and of producing results indicating to which of the expression profiles in the computer the one entered is 
most similar, such as to determine the stage of breast cancer in the subject. 
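
One way such a comparison against stored reference profiles could be carried out is sketched below. The specification does not prescribe a similarity measure; Pearson correlation is used here purely as an assumption, and the profile labels and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("constant profile: correlation undefined")
    return cov / (sx * sy)


def most_similar(query, references):
    """Return the label of the best-correlated reference profile and all scores."""
    scores = {label: pearson(query, prof) for label, prof in references.items()}
    return max(scores, key=scores.get), scores


# Hypothetical reference database: one profile per annotated cell type,
# with the genes in the same order as in the query profile.
references = {
    "normal": [1.0, 2.0, 3.0, 4.0],
    "stage I": [2.0, 2.0, 3.0, 1.0],
    "stage III": [4.0, 3.0, 2.0, 1.0],
}
query = [3.9, 3.1, 2.2, 0.8]

best, scores = most_similar(query, references)
print(best)  # stage III
```

Other measures (Euclidean distance, rank correlation) would fit the same structure; only the `pearson` function would change.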

[0231] In yet another embodiment, the reference expression profiles in the computer are expression profiles from 
cells of breast cancer of one or more subjects, which cells are treated in vivo or in vitro with a drug used for therapy 
of breast cancer. Upon entry of expression data of a cell of a subject treated in vitro or in vivo with the drug, the 
computer is instructed to compare the data entered to the data in the computer, and to provide results indicating whether 
the expression data input into the computer are more similar to those of a cell of a subject that is responsive to the 
drug or more similar to those of a cell of a subject that is not responsive to the drug. Thus, the results indicate whether 
the subject is likely to respond to the treatment with the drug or unlikely to respond to it. 

[0232] In one embodiment, the invention provides a system that comprises a means for receiving gene expression 
data for one or a plurality of genes; a means for comparing the gene expression data from each of said one or plurality 
of genes to a common reference frame; and a means for presenting the results of the comparison. This system may 
further comprise a means for clustering the data. 

[0233] In another embodiment, the invention provides a computer program for analyzing gene expression data comprising 
(i) a computer code that receives as input gene expression data for a plurality of genes and (ii) a computer code 
that compares said gene expression data from each of said plurality of genes to a common reference frame. 

[0234] The invention also provides a machine-readable or computer-readable medium including program instructions 
for performing the following steps: (i) comparing a plurality of values corresponding to expression levels of one or more 
genes characteristic of breast cancer in a query cell with a database including records comprising reference expression 
or expression profile data of one or more reference cells and an annotation of the type of cell; and (ii) indicating to 
which cell the query cell is most similar based on similarities of expression profiles. The reference cells can be cells 
from subjects at different stages of breast cancer. The reference cells can also be cells from subjects responding or 
not responding to a particular drug treatment and optionally incubated in vitro or in vivo with the drug. 
[0235] The reference cells may also be cells from subjects responding or not responding to several different treatments, 
and the computer system indicates a preferred treatment for the subject. Accordingly, the invention provides a 
method for selecting a therapy for a patient having breast cancer, the method comprising: (i) providing the level of 
expression of one or more genes characteristic of breast cancer in a diseased cell of the patient; (ii) providing a plurality 
of reference profiles, each associated with a therapy, wherein the subject expression profile and each reference profile 
has a plurality of values, each value representing the level of expression of a gene characteristic of breast cancer; and 
(iii) selecting the reference profile most similar to the subject expression profile, to thereby select a therapy for said 
patient. In a preferred embodiment step (iii) is performed by a computer. The most similar reference profile may be 
selected by weighting a comparison value of the plurality using a weight value associated with the corresponding expression 
data. 
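
Step (iii) with per-gene weights can be sketched as follows. The weighted squared difference used as the comparison value, the therapy labels, and all numbers are assumptions chosen for illustration; the specification leaves the comparison value and the weights open.

```python
def weighted_distance(subject, reference, weights):
    """Sum of per-gene squared differences, each scaled by its weight."""
    if not (len(subject) == len(reference) == len(weights)):
        raise ValueError("profiles and weights must have the same length")
    return sum(w * (s - r) ** 2 for s, r, w in zip(subject, reference, weights))


def select_therapy(subject, reference_profiles, weights):
    """Return the therapy whose reference profile is closest to the subject profile."""
    return min(
        reference_profiles,
        key=lambda t: weighted_distance(subject, reference_profiles[t], weights),
    )


# Hypothetical data: three genes, two reference profiles each tied to a therapy.
subject = [2.0, 0.5, 1.0]
reference_profiles = {
    "therapy_A": [2.0, 1.5, 1.0],  # differs from the subject only in gene 2
    "therapy_B": [1.0, 0.5, 1.0],  # differs from the subject only in gene 1
}
weights = [1.0, 0.1, 1.0]  # gene 2 is treated as less informative

print(select_therapy(subject, reference_profiles, weights))  # therapy_A
```

With equal weights the two reference profiles would be equally distant from the subject profile; the weight on gene 2 is what decides the selection in this example.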

[0236] The relative abundance of an mRNA in two biological samples can be scored as a perturbation and its magnitude 
determined (i.e., the abundance is different in the two sources of mRNA tested), or as not perturbed (i.e., the 
relative abundance is the same). In various embodiments, a difference between the two sources of RNA of at least a 
factor of about 25% (RNA is 25% more abundant in one source than in the other source), more usually 
about 50%, even more often by a factor of about 2 (twice as abundant), 3 (three times as abundant) or 5 (five times 
as abundant) is scored as a perturbation. Perturbations can be used by a computer for calculating and expressing 
comparisons. 

[0237] Preferably, in addition to identifying a perturbation as positive or negative, it is advantageous to determine 
the magnitude of the perturbation. This can be carried out, as noted above, by calculating the ratio of the emission of 
the two fluorophores used for differential labeling, or by analogous methods that will be readily apparent to those of 
skill in the art. 
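
The scoring of a perturbation, its sign, and its magnitude can be sketched as below. The function name and the default threshold are assumptions; the thresholds named in the text (1.25, 1.5, 2, 3, 5) can be passed in as needed, and the two inputs stand for the signals of the two samples, e.g. the emissions of the two fluorophores.

```python
def score_perturbation(level_a, level_b, threshold=2.0):
    """Score the relative abundance of one mRNA in two samples.

    Returns (direction, fold): direction is +1 if sample A is more abundant
    by at least `threshold`, -1 if sample B is, and 0 if not perturbed;
    fold is the magnitude of the difference, always >= 1.
    """
    if level_a <= 0 or level_b <= 0:
        raise ValueError("levels must be positive")
    ratio = level_a / level_b
    fold = max(ratio, 1.0 / ratio)
    if fold < threshold:
        return 0, fold
    return (1 if ratio > 1 else -1), fold


print(score_perturbation(300.0, 100.0))                  # (1, 3.0)
print(score_perturbation(100.0, 400.0))                  # (-1, 4.0)
print(score_perturbation(125.0, 100.0, threshold=1.25))  # (1, 1.25)
```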

[0238] The computer readable medium may further comprise a pointer to a descriptor of a stage of breast cancer or 
to a treatment for breast cancer. 

[0239] In operation, the means for receiving gene expression data, the means for comparing the gene expression 
data, the means for presenting, the means for normalizing, and the means for clustering within the context of the 
systems of the present invention can involve a programmed computer with the respective functionalities described 
herein, implemented in hardware or hardware and software; a logic circuit or other component of a programmed computer 
that performs the operations specifically identified herein, dictated by a computer program; or a computer memory 
encoded with executable instructions representing a computer program that can cause a computer to function in the 
particular fashion described herein. 

[0240] Those skilled in the art will understand that the systems and methods of the present invention may be applied 
to a variety of systems, including IBM-compatible personal computers running MS-DOS or Microsoft Windows. 
[0241] The computer may have internal components linked to external components. The internal components may 
include a processor element interconnected with a main memory. The computer system can be an Intel Pentium®-based 
processor of 200 MHz or greater clock rate and with 32 MB or more of main memory. The external component 
may comprise a mass storage, which can be one or more hard disks (which are typically packaged together with the 
processor and memory). Such hard disks are typically of 1 GB or greater storage capacity. Other external components 
include a user interface device, which can be a monitor, together with an inputting device, which can be a "mouse", or 
other graphic input devices, and/or a keyboard. A printing device can also be attached to the computer. 
[0242] Typically, the computer system is also linked to a network link, which can be part of an Ethernet link to other 
local computer systems, remote computer systems, or wide area communication networks, such as the Internet. This 
network link allows the computer system to share data and processing tasks with other computer systems. 
[0243] Loaded into memory during operation of this system are several software components, which are both standard 
in the art and special to the instant invention. These software components collectively cause the computer system 
to function according to the methods of this invention. These software components are typically stored on a mass 
storage. A software component represents the operating system, which is responsible for managing the computer 
system and its network interconnections. This operating system can be, for example, of the Microsoft Windows family, 
such as Windows 95, Windows 98, or Windows NT. A software component represents common languages and functions 
conveniently present on this system to assist programs implementing the methods specific to this invention. Many high 
or low level computer languages can be used to program the analytic methods of this invention. Instructions can be 
interpreted during run-time or compiled. Preferred languages include C/C++, and JAVA®. Most preferably, the methods 
of this invention are programmed in mathematical software packages which allow symbolic entry of equations and 
high-level specification of processing, including algorithms to be used, thereby freeing a user of the need to procedurally 
program individual equations or algorithms. Such packages include Matlab from Mathworks (Natick, Mass.), Mathematica 
from Wolfram Research (Champaign, Ill.), or S-Plus from MathSoft (Cambridge, Mass.). Accordingly, a software 
component represents the analytic methods of this invention as programmed in a procedural language or symbolic 
package. In a preferred embodiment, the computer system also contains a database comprising values representing 
levels of expression of one or more genes characteristic of breast cancer. The database may contain one or more 
expression profiles of genes characteristic of breast cancer in different cells. 

[0244] In an exemplary implementation, to practice the methods of the present invention, a user first loads expression 
profile data into the computer system. These data can be directly entered by the user from a monitor and keyboard, 
or from other computer systems linked by a network connection, or on removable storage media such as a CD-ROM 
or floppy disk or through the network. Next the user causes execution of expression profile analysis software which 
performs the steps of comparing and, e.g., clustering co-varying genes into groups of genes. 
[0245] In another exemplary implementation, expression profiles are compared using a method described in U.S. 
Patent No. 6,203,987. A user first loads expression profile data into the computer system. Geneset profile definitions 
are loaded into the memory from the storage media or from a remote computer, preferably from a dynamic geneset 
database system, through the network. Next the user causes execution of projection software which performs the steps 
of converting expression profile to projected expression profiles. The projected expression profiles are then displayed. 
[0246] In yet another exemplary implementation, a user first loads a projected profile into the memory. The user then 
causes the loading of a reference profile into the memory. Next, the user causes the execution of comparison software 
which performs the steps of objectively comparing the profiles. 

Detection of variant polynucleotide sequence 

[0247] In yet another embodiment, the invention provides methods for determining whether a subject is at risk for 
developing a disease, such as a predisposition to develop malignant neoplasia, for example breast cancer, associated 
with an aberrant activity of any one of the polypeptides encoded by any of the polynucleotides of the SEQ ID NO: 1 to 
26 or 53 to 75, wherein the aberrant activity of the polypeptide is characterized by detecting the presence or absence 
of a genetic lesion characterized by at least one of these: 

(i) an alteration affecting the integrity of a gene encoding a marker polypeptide, or 

(ii) the misexpression of the encoding polynucleotide. 

[0248] To illustrate, such genetic lesions can be detected by ascertaining the existence of at least one of these: 

I. a deletion of one or more nucleotides from the polynucleotide sequence 

II. an addition of one or more nucleotides to the polynucleotide sequence 

III. a substitution of one or more nucleotides of the polynucleotide sequence 

IV. a gross chromosomal rearrangement of the polynucleotide sequence 

V. a gross alteration in the level of a messenger RNA transcript of the polynucleotide sequence 

VI. aberrant modification of the polynucleotide sequence, such as of the methylation pattern of the genomic DNA 

VII. the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene 

VIII. a non-wild type level of the marker polypeptide 

IX. allelic loss of the gene 

X. allelic gain of the gene 

XI. inappropriate post-translational modification of the marker polypeptide 

[0249] The present invention provides assay techniques for detecting mutations in the encoding polynucleotide sequence. 
These methods include, but are not limited to, methods involving sequence analysis, Southern blot hybridization, 
restriction enzyme site mapping, and methods involving detection of absence of nucleotide pairing between 
the polynucleotide to be analyzed and a probe. 

[0250] Specific diseases or disorders, e.g., genetic diseases or disorders, are associated with specific allelic variants 
of polymorphic regions of certain genes, which do not necessarily encode a mutated protein. Thus, the presence of a 
specific allelic variant of a polymorphic region of a gene in a subject can render the subject susceptible to developing 
a specific disease or disorder. Polymorphic regions in genes can be identified by determining the nucleotide sequence 
of genes in populations of individuals. If a polymorphic region is identified, then the link with a specific disease can be 
determined by studying specific populations of individuals, e.g., individuals who developed a specific disease, such 
as breast cancer. A polymorphic region can be located in any region of a gene, e.g., exons, in coding or non-coding 
regions of exons, introns, and promoter region. 

[0251] In an exemplary embodiment, there is provided a polynucleotide composition comprising a polynucleotide 
probe including a region of nucleotide sequence which is capable of hybridising to a sense or antisense sequence of 
a gene or naturally occurring mutants thereof, or 5' or 3' flanking sequences or intronic sequences naturally associated 
with the subject genes or naturally occurring mutants thereof. The polynucleotide of a cell is rendered accessible for 
hybridization, the probe is contacted with the polynucleotide of the sample, and the hybridization of the probe to the 
sample polynucleotide is detected. Such techniques can be used to detect lesions or allelic variants at either the genomic 
or mRNA level, including deletions, substitutions, etc., as well as to determine mRNA transcript levels. 
[0252] A preferred detection method is allele specific hybridization using probes overlapping the mutation or polymorphic 
site and having about 5, 10, 20, 25, or 30 nucleotides around the mutation or polymorphic region. In a preferred 
embodiment of the invention, several probes capable of hybridising specifically to allelic variants are attached to a solid 
phase support, e.g., a "chip". Mutation detection analysis using these chips comprising oligonucleotides, also termed 
"DNA probe arrays", is described, e.g., in Cronin et al. (119). In one embodiment, a chip comprises all the allelic variants 
of at least one polymorphic region of a gene. The solid phase support is then contacted with a test polynucleotide and 
hybridization to the specific probes is detected. Accordingly, the identity of numerous allelic variants of one or more 
genes can be determined in a simple hybridization experiment. 
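
The design of allele specific probes around a polymorphic site can be illustrated with a toy model. This sketch is heavily simplified and rests on stated assumptions: the sequences are invented, hybridization is modelled as an exact same-strand string match (real probes bind the complementary strand and tolerate some mismatch depending on stringency), and the function names are hypothetical.

```python
def allele_probes(reference, site, alleles, flank=5):
    """Build one probe per allele: `flank` nucleotides on each side of the site.

    `site` is a 0-based index into `reference`.
    """
    if site - flank < 0 or site + flank >= len(reference):
        raise ValueError("not enough flanking sequence around the site")
    left = reference[site - flank:site]
    right = reference[site + 1:site + 1 + flank]
    return {allele: left + allele + right for allele in alleles}


def call_alleles(test_sequences, probes):
    """Alleles whose probe matches exactly in at least one test sequence."""
    return sorted(
        allele for allele, probe in probes.items()
        if any(probe in seq for seq in test_sequences)
    )


# Invented 21-nucleotide reference with a C/T polymorphism at index 10.
reference = "ATGGCCTTAGCAGGTCCATGA"
variant = "ATGGCCTTAGTAGGTCCATGA"

probes = allele_probes(reference, site=10, alleles=["C", "T"], flank=5)
print(probes)                                    # {'C': 'CTTAGCAGGTC', 'T': 'CTTAGTAGGTC'}
print(call_alleles([variant], probes))           # ['T']
print(call_alleles([reference, variant], probes))  # ['C', 'T']
```

The last call mimics a heterozygous sample, where both allele specific probes give a signal.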

[0253] In certain embodiments, detection of the lesion comprises utilizing the probe/primer in a polymerase chain 
reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, 
in a ligase chain reaction (LCR) [Landegran et al., 1988, (120) and Nakazawa et al., 1994, (121)], the latter of 
which can be particularly useful for detecting point mutations in the gene [Abravaya et al., 1995, (122)]. In a merely 
illustrative embodiment, the method includes the steps of (i) collecting a sample of cells from a patient, (ii) isolating 
polynucleotide (e.g., genomic, mRNA or both) from the cells of the sample, (iii) contacting the polynucleotide sample 
with one or more primers which specifically hybridize to a polynucleotide sequence under conditions such that hybridization 
and amplification of the polynucleotide (if present) occurs, and (iv) detecting the presence or absence of an 
amplification product, or detecting the size of the amplification product and comparing the length to a control sample. 
It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with 
any of the techniques used for detecting mutations described herein. 

[0254] Alternative amplification methods include: self sustained sequence replication [Guatelli, J.C. et al., 1990, 
(123)], transcriptional amplification system [Kwoh, D.Y. et al., 1989, (124)], Q-Beta replicase [Lizardi, P.M. et al., 1988, 
(125)], or any other polynucleotide amplification method, followed by the detection of the amplified molecules using 
techniques well known to those of skill in the art. These detection schemes are especially useful for the detection of 
polynucleotide molecules if such molecules are present in very low numbers. 

[0255] In a preferred embodiment of the subject assay, mutations in, or allelic variants of, a gene from a sample cell 
are identified by alterations in restriction enzyme cleavage patterns. For example, sample and control DNA is isolated, 
amplified (optionally), digested with one or more restriction endonucleases, and fragment length sizes are determined 
by gel electrophoresis. Moreover, sequence specific ribozymes (see, for example, U.S. Patent No. 5,498,531) 
can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site. 
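
The comparison of restriction fragment lengths between sample and control DNA can be simulated as follows. The sequences are invented for illustration and the helper `digest` is hypothetical; EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example of a restriction endonuclease.

```python
def digest(sequence, site, cut_offset):
    """Fragment lengths after cutting a linear sequence at every `site`.

    `cut_offset` is the cut position counted from the start of the site
    (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


control = "AAAAGAATTCCCCCCCCC"  # contains one EcoRI site
sample = "AAAAGAGTTCCCCCCCCC"   # A-to-G substitution destroys the site

control_fragments = digest(control, "GAATTC", 1)
sample_fragments = digest(sample, "GAATTC", 1)
print(control_fragments)  # [5, 13]
print(sample_fragments)   # [18]
print(control_fragments != sample_fragments)  # True: altered cleavage pattern
```

A substitution that creates a new site would show up the other way round, as additional shorter fragments in the sample.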

In situ hybridization 

[0256] In one aspect, the method comprises in situ hybridization with a probe derived from a given marker polynucleotide, 
which sequence is selected from any of the polynucleotide sequences of the SEQ ID NO: 1 to 9, or 11 to 19 
or 21 to 26 and 53 to 75 or a sequence complementary thereto. The method comprises contacting the labeled hybridization 
probe with a sample of a given type of tissue from a patient potentially having malignant neoplasia, and breast 
cancer in particular, as well as normal tissue from a person with no malignant neoplasia, and determining whether the 
probe labels tissue of the patient to a degree significantly different (e.g., by at least a factor of two, or at least a factor 
of five, or at least a factor of twenty, or at least a factor of fifty) than the degree to which normal tissue is labelled. 

Polypeptide detection 

[0257] The subject invention further provides a method of determining whether a cell sample obtained from a subject 
possesses an abnormal amount of marker polypeptide which comprises (a) obtaining a cell sample from the subject, 
(b) quantitatively determining the amount of the marker polypeptide in the sample so obtained, and (c) comparing the 
amount of the marker polypeptide so determined with a known standard, so as to thereby determine whether the cell 
sample obtained from the subject possesses an abnormal amount of the marker polypeptide. Such marker polypeptides 
may be detected by immunohistochemical assays, dot-blot assays, ELISA and the like. 

Antibodies 

[0258] Any type of antibody known in the art can be generated to bind specifically to an epitope of a "BREAST 
CANCER GENE" polypeptide. An antibody as used herein includes intact immunoglobulin molecules, as well as fragments 
thereof, such as Fab, F(ab')2, and Fv, which are capable of binding an epitope of a "BREAST CANCER GENE" 
polypeptide. Typically, at least 6, 8, 10, or 12 contiguous amino acids are required to form an epitope. However, epitopes 
which involve non-contiguous amino acids may require more, e.g., at least 15, 25, or 50 amino acids. 

[0259] An antibody which specifically binds to an epitope of a "BREAST CANCER GENE" polypeptide can be used 
therapeutically, as well as in immunochemical assays, such as Western blots, ELISAs, radioimmunoassays, immunohistochemical 
assays, immunoprecipitations, or other immunochemical assays known in the art. Various immunoassays 
can be used to identify antibodies having the desired specificity. Numerous protocols for competitive binding 
or immunoradiometric assays are well known in the art. Such immunoassays typically involve the measurement of 
complex formation between an immunogen and an antibody which specifically binds to the immunogen. 
[0260] Typically, an antibody which specifically binds to a "BREAST CANCER GENE" polypeptide provides a detection 
signal at least 5-, 10-, or 20-fold higher than a detection signal provided with other proteins when used in an 
immunochemical assay. Preferably, antibodies which specifically bind to "BREAST CANCER GENE" polypeptides do 
not detect other proteins in immunochemical assays and can immunoprecipitate a "BREAST CANCER GENE" polypeptide 
from solution. 

[0261] "BREAST CANCER GENE" polypeptides can be used to immunize a mammal, such as a mouse, rat, rabbit, 
guinea pig, monkey, or human, to produce polyclonal antibodies. If desired, a "BREAST CANCER GENE" polypeptide 
can be conjugated to a carrier protein, such as bovine serum albumin, thyroglobulin, or keyhole limpet hemocyanin. 
Depending on the host species, various adjuvants can be used to increase the immunological response. Such adjuvants 
include, but are not limited to, Freund's adjuvant, mineral gels (e.g., aluminum hydroxide), and surface active substances 
(e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and dinitrophenol). 
Among adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are especially 
useful. 

[0262] Monoclonal antibodies which specifically bind to a "BREAST CANCER GENE" polypeptide can be prepared 
using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These 
techniques include, but are not limited to, the hybridoma technique, the human B cell hybridoma technique, and the 
EBV hybridoma technique [Kohler et al., 1985, (136); Kozbor et al., 1985, (137); Cote et al., 1983, (138) and Cole et 
al., 1984, (139)]. 

[0263] In addition, techniques developed for the production of chimeric antibodies, the splicing of mouse antibody 
genes to human antibody genes to obtain a molecule with appropriate antigen specificity and biological activity, can 
be used [Morrison et al., 1984, (140); Neuberger et al., 1984, (141); Takeda et al., 1985, (142)]. Monoclonal and other 
antibodies also can be humanized to prevent a patient from mounting an immune response against the antibody when 
it is used therapeutically. Such antibodies may be sufficiently similar in sequence to human antibodies to be used 
directly in therapy or may require alteration of a few key residues. Sequence differences between rodent antibodies 
and human sequences can be minimized by replacing residues which differ from those in the human sequences by 
site directed mutagenesis of individual residues or by grafting of entire complementarity determining regions. Alternatively, 
humanized antibodies can be produced using recombinant methods, as described in GB2188638B. Antibodies 
which specifically bind to a "BREAST CANCER GENE" polypeptide can contain antigen binding sites which are either 
partially or fully humanized, as disclosed in U.S. Patent 5,565,332. 

[0264] Alternatively, techniques described for the production of single chain antibodies can be adapted using methods 
known in the art to produce single chain antibodies which specifically bind to "BREAST CANCER GENE" polypeptides. 
Antibodies with related specificity, but of distinct idiotypic composition, can be generated by chain shuffling from random 
combinatorial immunoglobulin libraries [Burton, 1991, (143)]. 

[0265] Single-chain antibodies also can be constructed using a DNA amplification method, such as PCR, using 
hybridoma cDNA as a template [Thirion et al., 1996, (144)]. Single-chain antibodies can be mono- or bispecific, and 
can be bivalent or tetravalent. Construction of tetravalent, bispecific single-chain antibodies is taught, for example, in 
Coloma & Morrison, (145). Construction of bivalent, bispecific single-chain antibodies is taught in Mallender & Voss, 
(146). 

[0266] A nucleotide sequence encoding a single-chain antibody can be constructed using manual or automated 
nucleotide synthesis, cloned into an expression construct using standard recombinant DNA methods, and introduced 
into a cell to express the coding sequence, as described below. Alternatively, single-chain antibodies can be produced 
directly using, for example, filamentous phage technology [Verhaar et al., 1995, (147); Nicholls et al., 1993, (148)]. 
[0267] Antibodies which specifically bind to "BREAST CANCER GENE" polypeptides also can be produced by inducing 
in vivo production in the lymphocyte population or by screening immunoglobulin libraries or panels of highly 
specific binding reagents as disclosed in the literature [Orlandi et al., 1989, (149) and Winter et al., 1991, (150)]. 
[0268] Other types of antibodies can be constructed and used therapeutically in methods of the invention. For example, 
chimeric antibodies can be constructed as disclosed in WO 93/03151. Binding proteins which are derived from 
immunoglobulins and which are multivalent and multispecific, such as the antibodies described in WO 94/13804, also 
can be prepared. 

[0269] Antibodies according to the invention can be purified by methods well known in the art. For example, antibodies 
can be affinity purified by passage over a column to which a "BREAST CANCER GENE" polypeptide is bound. The 
bound antibodies can then be eluted from the column using a buffer with a high salt concentration. 
[0270] Immunoassays are commonly used to quantify the levels of proteins in cell samples, and many other immunoassay 
techniques are known in the art. The invention is not limited to a particular assay procedure, and therefore is 
intended to include both homogeneous and heterogeneous procedures. Exemplary immunoassays which can be conducted 
according to the invention include fluorescence polarisation immunoassay (FPIA), fluorescence immunoassay 
(FIA), enzyme immunoassay (EIA), nephelometric inhibition immunoassay (NIA), enzyme linked immunosorbent assay 
(ELISA), and radioimmunoassay (RIA). An indicator moiety, or label group, can be attached to the subject antibodies 
and is selected so as to meet the needs of various uses of the method which are often dictated by the availability of 
assay equipment and compatible immunoassay procedures. General techniques to be used in performing the various 
immunoassays noted above are known to those of ordinary skill in the art. 

[0271] In another embodiment, the level of at least one product encoded by any of the polynucleotide sequences of 
the SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19 or 21 to 26 or 53 to 75 or of at least 2 products encoded by a polynucleotide 
selected from SEQ ID NO: 1 to 26 and 53 to 75 or a sequence complementary thereto, in a biological fluid (e.g., blood 
or urine) of a patient may be determined as a way of monitoring the level of expression of the marker polynucleotide 
sequence in cells of that patient. Such a method would include the steps of obtaining a sample of a biological fluid 
from the patient, contacting the sample (or proteins from the sample) with an antibody specific for an encoded marker 
polypeptide, and determining the amount of immune complex formation by the antibody, with the amount of immune 
complex formation being indicative of the level of the marker encoded product in the sample. This determination is 
particularly instructive when compared to the amount of immune complex formation by the same antibody in a control 
sample taken from a normal individual or in one or more samples previously or subsequently obtained from the same 
person. 

[0272] In another embodiment, the method can be used to determine the amount of marker polypeptide present in 
a cell, which in turn can be correlated with progression of the disorder, e.g., plaque formation. The level of the marker 
polypeptide can be used predictively to evaluate whether a sample of cells contains cells which are, or are predisposed 
towards becoming, plaque associated cells. The observation of marker polypeptide level can be utilized in decisions 
regarding, e.g., the use of more stringent therapies. 

[0273] As set out above, one aspect of the present invention relates to diagnostic assays for determining, in the
context of cells isolated from a patient, if the level of a marker polypeptide is significantly reduced in the sample cells.
The term "significantly reduced" refers to a cell phenotype wherein the cell possesses a reduced cellular amount of
the marker polypeptide relative to a normal cell of similar tissue origin. For example, a cell may have less than about
50%, 25%, 10%, or 5% of the marker polypeptide of a normal control cell. In particular, the assay evaluates the level
of marker polypeptide in the test cells, and, preferably, compares the measured level with marker polypeptide detected
in at least one control cell, e.g., a normal cell and/or a transformed cell of known phenotype.
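
The comparison against a control cell described above can be sketched as follows. This is only an illustrative sketch: the function names and the default cutoff of 50% are assumptions chosen from the example percentages in the text, not part of the described assay.

```python
# Hypothetical helpers; names and the default cutoff are illustrative assumptions.
def fraction_of_control(test_level: float, control_level: float) -> float:
    """Return the test-cell marker level as a fraction of the normal control level."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return test_level / control_level


def is_significantly_reduced(test_level: float, control_level: float,
                             cutoff: float = 0.50) -> bool:
    """True if the test cell has less than `cutoff` of the control marker level."""
    return fraction_of_control(test_level, control_level) < cutoff
```

A stricter reading of "significantly reduced" would pass one of the lower cutoffs named in the text (0.25, 0.10 or 0.05) instead of the default.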

[0274] Of particular importance to the subject invention is the ability to quantify the level of marker polypeptide as
determined by the number of cells associated with a normal or abnormal marker polypeptide level. The number of cells
with a particular marker polypeptide phenotype may then be correlated with patient prognosis. In one embodiment of
the invention, the marker polypeptide phenotype of the lesion is determined as a percentage of cells in a biopsy which
are found to have abnormally high/low levels of the marker polypeptide. Such expression may be detected by
immunohistochemical assays, dot-blot assays, ELISA and the like.
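
The percentage-of-cells phenotype described above can be illustrated with a short sketch. The per-cell levels and the bounds of the "normal" range are hypothetical inputs; the text does not specify how the normal range is established.

```python
from typing import Sequence


def percent_abnormal_cells(levels: Sequence[float],
                           normal_low: float,
                           normal_high: float) -> float:
    """Percentage of cells whose marker level lies outside [normal_low, normal_high].

    `normal_low` and `normal_high` are assumed, user-supplied bounds.
    """
    if not levels:
        raise ValueError("at least one cell measurement is required")
    abnormal = sum(1 for x in levels if x < normal_low or x > normal_high)
    return 100.0 * abnormal / len(levels)
```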

Immunohistochemistry 

[0275] Where tissue samples are employed, immunohistochemical staining may be used to determine the number
of cells having the marker polypeptide phenotype. For such staining, a multiblock of tissue is taken from the biopsy or
other tissue sample and subjected to proteolytic hydrolysis, employing such agents as proteinase K or pepsin. In certain
embodiments, it may be desirable to isolate a nuclear fraction from the sample cells and detect the level of the marker
polypeptide in the nuclear fraction.

[0276] The tissue samples are fixed by treatment with a reagent such as formalin, glutaraldehyde, methanol, or the
like. The samples are then incubated with an antibody, preferably a monoclonal antibody, with binding specificity for
the marker polypeptides. This antibody may be conjugated to a label for subsequent detection of binding. Samples
are incubated for a time sufficient for formation of the immuno-complexes. Binding of the antibody is then detected by
virtue of a label conjugated to this antibody. Where the antibody is unlabeled, a second labeled antibody may be
employed, e.g., one which is specific for the isotype of the anti-marker polypeptide antibody. Examples of labels which may
be employed include radionuclides, fluorescence, chemiluminescence, and enzymes.

[0277] Where enzymes are employed, the substrate for the enzyme may be added to the samples to provide a
colored or fluorescent product. Examples of suitable enzymes for use in conjugates include horseradish peroxidase,
alkaline phosphatase, malate dehydrogenase and the like. Where not commercially available, such antibody-enzyme
conjugates are readily produced by techniques known to those skilled in the art.

[0278] In one embodiment, the assay is performed as a dot blot assay. The dot blot assay finds particular application
where tissue samples are employed as it allows determination of the average amount of the marker polypeptide
associated with a single cell by correlating the amount of marker polypeptide in a cell-free extract produced from a
predetermined number of cells.
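
The per-cell averaging that the dot blot assay permits amounts to a single division, sketched below. The function name and the choice of units are assumptions for illustration only.

```python
def average_marker_per_cell(extract_amount: float, cell_count: int) -> float:
    """Average marker polypeptide per cell, in the same unit as `extract_amount`.

    `extract_amount` is the total marker measured in the cell-free extract and
    `cell_count` is the predetermined number of cells the extract was made from.
    """
    if cell_count <= 0:
        raise ValueError("cell count must be a positive integer")
    return extract_amount / cell_count
```

For example, an extract containing 10 ng of marker polypeptide prepared from 1000 cells corresponds to an average of 0.01 ng per cell.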

[0279] In yet another embodiment, the invention contemplates using one or more antibodies which are generated
against one or more of the marker polypeptides of this invention, which polypeptides are encoded by any of the
polynucleotide sequences of the SEQ ID NO: 1 to 26 or 53 to 75. Such a panel of antibodies may be used as a reliable
diagnostic probe for breast cancer. The assay of the present invention comprises contacting a biopsy sample containing
cells, e.g., macrophages, with a panel of antibodies to one or more of the encoded products to determine the presence
or absence of the marker polypeptides.

[0280] The diagnostic methods of the subject invention may also be employed as follow-up to treatment, e.g.,
quantification of the level of marker polypeptides may be indicative of the effectiveness of current or previously employed
therapies for malignant neoplasia and breast cancer in particular as well as the effect of these therapies upon patient
prognosis.

[0281] The diagnostic assays described above can be adapted to be used as prognostic assays, as well. Such an
application takes advantage of the sensitivity of the assays of the invention to events which take place at characteristic
stages in the progression of plaque generation in case of malignant neoplasia. For example, a given marker gene may
be up- or down-regulated at a very early stage, perhaps before the cell is developing into a foam cell, while another
marker gene may be characteristically up- or down-regulated only at a much later stage. Such a method could involve
the steps of contacting the mRNA of a test cell with a polynucleotide probe derived from a given marker polynucleotide
which is expressed at different characteristic levels in breast cancer tissue cells at different stages of malignant
neoplasia progression, and determining the approximate amount of hybridization of the probe to the mRNA of the cell,
such amount being an indication of the level of expression of the gene in the cell, and thus an indication of the stage
of disease progression of the cell; alternatively, the assay can be carried out with an antibody specific for the gene
product of the given marker polynucleotide, contacted with the proteins of the test cell. A battery of such tests will
disclose not only the existence of a certain arteriosclerotic plaque, but also will allow the clinician to select the mode
of treatment most appropriate for the disease, and to predict the likelihood of success of that treatment.

[0282] The methods of the invention can also be used to follow the clinical course of a given breast cancer
predisposition. For example, the assay of the invention can be applied to a blood sample from a patient; following treatment
of the patient for breast cancer, another blood sample is taken and the test repeated. Successful treatment will
result in removal of demonstrable differential expression, characteristic of the breast cancer tissue cells, perhaps
approaching or even surpassing normal levels.

Polypeptide activity

[0283] In one embodiment the present invention provides a method for screening potentially therapeutic agents which
modulate the activity of one or more "BREAST CANCER GENE" polypeptides, such that if the activity of the polypeptide
is increased as a result of the upregulation of the "BREAST CANCER GENE" in a subject having or at risk for malignant
neoplasia and breast cancer in particular, the therapeutic substance will decrease the activity of the polypeptide relative
to the activity of the same polypeptide in a subject not having or not at risk for malignant neoplasia or breast cancer
in particular but not treated with the therapeutic agent. Likewise, if the activity of the polypeptide is decreased as a
result of the downregulation of the "BREAST CANCER GENE" in a subject having or at risk for malignant neoplasia
or breast cancer in particular, the therapeutic agent will increase the activity of the polypeptide relative to the activity
of the same polypeptide in a subject not having or not at risk for malignant neoplasia or breast cancer in particular, but
not treated with the therapeutic agent.

[0284] The activity of the "BREAST CANCER GENE" polypeptides indicated in Table 2 or 3 may be measured by
any means known to those of skill in the art which is particular for the type of activity performed by the particular
polypeptide. Examples of specific assays which may be used to measure the activity of particular polypeptides are
shown below.

a) G protein coupled receptors 

[0285] In one embodiment, the "BREAST CANCER GENE" polynucleotide may encode a G protein coupled receptor.
In one embodiment, the present invention provides a method of screening potential modulators (inhibitors or activators)
of the G protein coupled receptor by measuring changes in the activity of the receptor in the presence of a candidate
modulator.

1. Gi-coupled receptors

[0286] Cells (such as CHO cells or primary cells) are stably transfected with the relevant receptor and with an
inducible CRE-luciferase construct. Cells are grown in 50% Dulbecco's modified Eagle medium / 50% F12 (DMEM/F12)
supplemented with 10% FBS, at 37°C in a humidified atmosphere with 10% CO2 and are routinely split at a ratio of
1:10 every 2 or 3 days. Test cultures are seeded into 384-well plates at an appropriate density (e.g. 2000 cells/well in
35 µl cell culture medium) in DMEM/F12 with FBS, and are grown for 48 hours (range: 24-60 hours, depending on
cell line). Growth medium is then exchanged for serum-free medium (SFM; e.g. Ultra-CHO) containing 0.1% BSA.
Test compounds dissolved in DMSO are diluted in SFM and transferred to the test cultures (maximal final concentration
10 µmolar), followed by addition of forskolin (~1 µmolar, final conc.) in SFM + 0.1% BSA 10 minutes later. In the case of
antagonist screening, both an appropriate concentration of agonist and forskolin are added. The plates are incubated
at 37°C in 10% CO2 for 3 hours. Then the supernatant is removed and the cells are lysed with lysis reagent (25 mmolar
phosphate buffer, pH 7.8, containing 2 mmolar DTT, 10% glycerol and 3% Triton X100). The luciferase reaction is
started by addition of substrate buffer (e.g. luciferase assay reagent, Promega) and luminescence is immediately
determined (e.g. Berthold luminometer or Hamamatsu camera system).

2. Gs-coupled receptors

[0287] Cells (such as CHO cells or primary cells) are stably transfected with the relevant receptor and with an
inducible CRE-luciferase construct. Cells are grown in 50% Dulbecco's modified Eagle medium / 50% F12 (DMEM/F12)
supplemented with 10% FBS, at 37°C in a humidified atmosphere with 10% CO2 and are routinely split at a ratio of
1:10 every 2 or 3 days. Test cultures are seeded into 384-well plates at an appropriate density (e.g. 1000 or 2000 cells
/well in 35 µl cell culture medium) in DMEM/F12 with FBS, and are grown for 48 hours (range: 24-60 hours, depending
on cell line). The assay is started by addition of test compounds in serum-free medium (SFM; e.g. Ultra-CHO) containing
0.1% BSA: test compounds are dissolved in DMSO, diluted in SFM and transferred to the test cultures (maximal final
concentration 10 µmolar, DMSO conc. < 0.6%). In the case of antagonist screening, an appropriate concentration of agonist
is added 5-10 minutes later. The plates are incubated at 37°C in 10% CO2 for 3 hours. Then the cells are lysed with
10 µl lysis reagent per well (25 mmolar phosphate buffer, pH 7.8, containing 2 mmolar DTT, 10% glycerol and 3%
Triton X100) and the luciferase reaction is started by addition of 20 µl substrate buffer per well (e.g. luciferase assay
reagent, Promega). Measurement of luminescence is started immediately (e.g. Berthold luminometer or Hamamatsu
camera system).

3. Gq-coupled receptors

[0288] Cells (such as CHO cells or primary cells) are stably transfected with the relevant receptor. Cells expressing
functional receptor protein are grown in 50% Dulbecco's modified Eagle medium / 50% F12 (DMEM/F12) supplemented
with 10% FBS, at 37°C in a humidified atmosphere with 5% CO2 and are routinely split at a cell line dependent ratio
every 3 or 4 days. Test cultures are seeded into 384-well plates at an appropriate density (e.g. 2000 cells/well in 35
µl cell culture medium) in DMEM/F12 with FBS, and are grown for 48 hours (range: 24-60 hours, depending on cell
line). Growth medium is then exchanged for physiological salt solution (e.g. Tyrode solution). Test compounds
dissolved in DMSO are diluted in Tyrode solution containing 0.1% BSA and transferred to the test cultures (maximal
final concentration 10 µmolar). After addition of the receptor-specific agonist, the resulting Gq-mediated intracellular
calcium increase is measured using appropriate read-out systems (e.g. calcium-sensitive dyes).

b) Ion channels

[0289] Ion channels are integral membrane proteins involved in electrical signaling, transmembrane signal
transduction, and electrolyte and solute transport. By forming macromolecular pores through the membrane lipid bilayer, ion
channels account for the flow of specific ion species driven by the electrochemical potential gradient for the permeating
ion. At the single molecule level, individual channels undergo conformational transitions ("gating") between the 'open'
(ion conducting) and 'closed' (non conducting) state. Typical single channel openings last for a few milliseconds and
result in elementary transmembrane currents in the range of 10⁻⁹ to 10⁻¹² ampere. Channel gating is controlled by
various chemical and/or biophysical parameters, such as neurotransmitters and intracellular second messengers
('ligand-gated' channels) or membrane potential ('voltage-gated' channels). Ion channels are functionally characterized
by their ion selectivity, gating properties, and regulation by hormones and pharmacological agents. Because of their
central role in signaling and transport processes, ion channels present ideal targets for pharmacological therapeutics
in various pathophysiological settings.

[0290] In one embodiment, the "BREAST CANCER GENE" may encode an ion channel. In one embodiment, the
present invention provides a method of screening potential activators or inhibitors of channel activity of the "BREAST
CANCER GENE" polypeptide. Screening for compounds interacting with ion channels to either inhibit or promote their
activity can be based on (1) binding and (2) functional assays in living cells [Hide (183)].

1. For ligand-gated channels, e.g. ionotropic neurotransmitter/hormone receptors, assays can be designed
detecting binding to the target by competition between the compound and a labeled ligand.

2. Ion channel function can be tested functionally in living cells. Target proteins are either expressed endogenously
in appropriate reporter cells or are introduced recombinantly. Channel activity can be monitored by (2.1)
concentration changes of the permeating ion (most prominently Ca2+ ions), (2.2) by changes in the transmembrane
electrical potential gradient, and (2.3) by measuring a cellular response (e.g. expression of a reporter gene, secretion
of a neurotransmitter) triggered or modulated by the target activity.

2.1 Channel activity results in transmembrane ion fluxes. Thus activation of ionic channels can be monitored
by the resulting changes in intracellular ion concentrations using luminescent or fluorescent indicators.
Because of its wide dynamic range and availability of suitable indicators this applies particularly to changes in
intracellular Ca2+ ion concentration ([Ca2+]i). [Ca2+]i can be measured, for example, by aequorin luminescence
or fluorescence dye technology (e.g. using Fluo-3, Indo-1, Fura-2). Cellular assays can be designed where
either the Ca2+ flux through the target channel itself is measured directly or where modulation of the target
channel affects membrane potential and thereby the activity of co-expressed voltage-gated Ca2+ channels.

2.2 Ion channel currents result in changes of electrical membrane potential (Vm) which can be monitored
directly using potentiometric fluorescent probes. These electrically charged indicators (e.g. the anionic oxonol
dye DiBAC4(3)) redistribute between extra- and intracellular compartment in response to voltage changes.
The equilibrium distribution is governed by the Nernst equation. Thus changes in membrane potential result
in concomitant changes in cellular fluorescence. Again, changes in Vm might be caused directly by the activity
of the target ion channel or through amplification and/or prolongation of the signal by channels co-expressed
in the same cell.

2.3 Target channel activity can cause cellular Ca2+ entry either directly or through activation of additional Ca2+
channels (see 2.1). The resulting intracellular Ca2+ signals regulate a variety of cellular responses, e.g. secretion
or gene transcription. Therefore modulation of the target channel can be detected by monitoring secretion of
a known hormone/transmitter from the target-expressing cell or through expression of a reporter gene (e.g.
luciferase) controlled by a Ca2+-responsive promoter element (e.g. cyclic AMP/Ca2+-responsive elements;
CRE).
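
Item 2.2 above states that the equilibrium distribution of a charged potentiometric dye follows the Nernst equation. A minimal numerical sketch of that relationship is given below; the assumption of a monovalent anionic dye (z = -1), the default temperature of 37°C and the function name are illustrative choices, not values taken from the text.

```python
import math

R = 8.314462618   # molar gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_in_out_ratio(v_m_volts: float, z: int = -1,
                        temp_kelvin: float = 310.15) -> float:
    """Equilibrium ratio [dye]in / [dye]out for a permeant ion of valence z.

    Derived from the Nernst equation Vm = (R*T / (z*F)) * ln([out] / [in]).
    z = -1 is an assumption for a monovalent anionic dye.
    """
    return math.exp(-z * F * v_m_volts / (R * temp_kelvin))


# An anionic dye is largely excluded from a cell resting near -60 mV and
# enters the cell as the membrane depolarizes towards 0 mV.
resting = nernst_in_out_ratio(-0.060)
depolarized = nernst_in_out_ratio(-0.030)
```

Under these assumptions the intracellular dye concentration rises on depolarization, which is why a change in Vm shows up as a change in cellular fluorescence.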

c) DNA-binding proteins and transcription factors 

[0291] In one embodiment, the "BREAST CANCER GENE" may encode a DNA-binding protein or a transcription
factor. The activity of such a DNA-binding protein or a transcription factor may be measured, for example, by a promoter
assay which measures the ability of the DNA-binding protein or the transcription factor to initiate transcription of a test
sequence linked to a particular promoter. In one embodiment, the present invention provides a method of screening
test compounds for their ability to modulate the activity of such a DNA-binding protein or a transcription factor by
measuring the changes in the expression of a test gene which is regulated by a promoter which is responsive to the
transcription factor.

d) Promoter assays

[0292] A promoter assay was set up with a human hepatocellular carcinoma cell line, HepG2, that was stably transfected
with a luciferase gene under the control of a promoter regulated by a gene of interest (e.g. thyroid hormone). The vector
2xIROluc, which was used for transfection, carries a thyroid hormone responsive element (TRE) of two 12 bp inverted
palindromes separated by an 8 bp spacer in front of a tk minimal promoter and the luciferase gene. Test cultures were
seeded in 96-well plates in serum-free Eagle's Minimal Essential Medium supplemented with glutamine, tricine, sodium
pyruvate, non-essential amino acids, insulin, selenium and transferrin, and were cultivated in a humidified atmosphere at
10% CO2 at 37°C. After 48 hours of incubation, serial dilutions of test compounds or reference compounds (e.g. L-T3, L-T4)
and co-stimulator if appropriate (final concentration 1 nM) were added to the cell cultures and incubation was
continued for the optimal time (e.g. another 4-72 hours). The cells were then lysed by addition of buffer containing
Triton X100 and luciferin and the luminescence of luciferase induced by T3 or other compounds was measured in a
luminometer. For each concentration of a test compound four replicates were tested. EC50 values for each test
compound were calculated by use of the GraphPad Prism scientific software.
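
The text reports EC50 values computed with GraphPad Prism and does not state the curve model used. As a rough stand-in only, the sketch below averages the four replicates per concentration and estimates EC50 by linear interpolation on a log10 concentration axis. It assumes a response that rises with concentration; the function names and the example numbers are hypothetical, and this is not the fitting procedure of the cited software.

```python
import math
from statistics import mean


def mean_responses(replicates):
    """Average the replicate readings recorded at each concentration."""
    return [mean(wells) for wells in replicates]


def ec50_by_interpolation(concentrations, responses):
    """Estimate EC50 as the concentration giving the half-maximal response.

    Interpolates linearly between the two neighbouring points on a log10
    concentration scale; assumes the response increases with concentration.
    """
    pairs = sorted(zip(concentrations, responses))
    bottom = min(r for _, r in pairs)
    top = max(r for _, r in pairs)
    half = bottom + 0.5 * (top - bottom)
    for (c_lo, r_lo), (c_hi, r_hi) in zip(pairs, pairs[1:]):
        if r_lo <= half <= r_hi and r_hi != r_lo:
            frac = (half - r_lo) / (r_hi - r_lo)
            log_ec50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ec50
    raise ValueError("response does not cross the half-maximal level")


# Hypothetical luminescence readings, four replicates per concentration (molar).
concs = [1e-10, 1e-9, 1e-8, 1e-7]
replicates = [[0, 0, 0, 0], [20, 30, 25, 25], [70, 80, 75, 75], [100, 100, 100, 100]]
ec50 = ec50_by_interpolation(concs, mean_responses(replicates))
```

A full analysis would normally fit a sigmoidal dose-response model to all replicate points; interpolation is used here only to keep the sketch dependency-free.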

Screening Methods 

[0293] The invention provides assays for screening test compounds which bind to or modulate the activity of a
"BREAST CANCER GENE" polypeptide or a "BREAST CANCER GENE" polynucleotide. A test compound preferably
binds to a "BREAST CANCER GENE" polypeptide or polynucleotide. More preferably, a test compound decreases or
increases "BREAST CANCER GENE" activity by at least about 10, preferably about 50, more preferably about 75, 90,
or 100% relative to the absence of the test compound.
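
The percentage criterion above can be made concrete with a small sketch. The function names are hypothetical; the default threshold of 10% is the lowest value named in the text.

```python
def percent_change(activity_with: float, activity_without: float) -> float:
    """Signed percent change in activity relative to the absence of the compound."""
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    return 100.0 * (activity_with - activity_without) / activity_without


def meets_threshold(activity_with: float, activity_without: float,
                    threshold_percent: float = 10.0) -> bool:
    """True if the compound raises or lowers activity by at least the threshold."""
    return abs(percent_change(activity_with, activity_without)) >= threshold_percent
```

Passing 50, 75, 90 or 100 as `threshold_percent` corresponds to the progressively preferred levels of modulation listed in the paragraph.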

Test Compounds

[0294] Test compounds can be pharmacological agents already known in the art or can be compounds previously
unknown to have any pharmacological activity. The compounds can be naturally occurring or designed in the laboratory.
They can be isolated from microorganisms, animals, or plants, and can be produced recombinantly, or synthesised by
chemical methods known in the art. If desired, test compounds can be obtained using any of the numerous combinatorial
library methods known in the art, including but not limited to, biological libraries, spatially addressable parallel solid
phase or solution phase libraries, synthetic library methods requiring de-convolution, the one-bead one-compound
library method, and synthetic library methods using affinity chromatography selection. The biological library approach
is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer,
or small molecule libraries of compounds. [For review see Lam, 1997, (151)].

[0295] Methods for the synthesis of molecular libraries are well known in the art [see, for example, DeWitt et al.,
1993, (152); Erb et al., 1994, (153); Zuckermann et al., 1994, (154); Cho et al., 1993, (155); Carell et al., 1994, (156)
and Gallop et al., 1994, (157)]. Libraries of compounds can be presented in solution [see, e.g., Houghten, 1992, (158)],
or on beads [Lam, 1991, (159)], DNA-chips [Fodor, 1993, (160)], bacteria or spores (Ladner, U.S. Patent 5,223,409),
plasmids [Cull et al., 1992, (161)], or phage [Scott & Smith, 1990, (162); Devlin, 1990, (163); Cwirla et al., 1990, (164);
Felici, 1991, (165)].

High Throughput Screening

[0296] Test compounds can be screened for the ability to bind to "BREAST CANCER GENE" polypeptides or
polynucleotides or to affect "BREAST CANCER GENE" activity or "BREAST CANCER GENE" expression using high
throughput screening. Using high throughput screening, many discrete compounds can be tested in parallel so that
large numbers of test compounds can be quickly screened. The most widely established techniques utilize 96-well,
384-well or 1536-well microtiter plates. The wells of the microtiter plates typically require assay volumes that range
from 5 to 500 µl. In addition to the plates, many instruments, materials, pipettors, robotics, plate washers, and plate
readers are commercially available to fit the microwell formats.
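
When compounds are tracked across the plate formats named above, a well index is commonly mapped to a row letter and column number. The sketch below assumes the conventional 8 x 12, 16 x 24 and 32 x 48 grids and row labels running A to Z and then AA onward; the function names and zero-based, row-by-row indexing are illustrative assumptions.

```python
# Rows x columns for the plate formats named in the text (conventional layouts).
PLATE_LAYOUTS = {96: (8, 12), 384: (16, 24), 1536: (32, 48)}


def row_label(row: int) -> str:
    """Zero-based row index to letters: 0 -> 'A', 25 -> 'Z', 26 -> 'AA'."""
    label = ""
    row += 1
    while row > 0:
        row, rem = divmod(row - 1, 26)
        label = chr(ord("A") + rem) + label
    return label


def well_name(plate_size: int, index: int) -> str:
    """Zero-based well index, counted row by row, to a name such as 'A1'."""
    rows, cols = PLATE_LAYOUTS[plate_size]
    if not 0 <= index < rows * cols:
        raise ValueError("well index out of range for this plate")
    r, c = divmod(index, cols)
    return f"{row_label(r)}{c + 1}"
```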

[0297] Alternatively, free format assays, or assays that have no physical barrier between samples, can be used. For
example, an assay using pigment cells (melanocytes) in a simple homogeneous assay for combinatorial peptide
libraries is described by Jayawickreme et al., (166). The cells are placed under agarose in culture dishes, then beads
that carry combinatorial compounds are placed on the surface of the agarose. The combinatorial compounds are
partially released from the beads. Active compounds can be visualised as dark pigment areas because,
as the compounds diffuse locally into the gel matrix, the active compounds cause the cells to change colors.

[0298] Another example of a free format assay is described by Chelsky, (167). Chelsky placed a simple homogeneous
enzyme assay for carbonic anhydrase inside an agarose gel such that the enzyme in the gel would cause a color
change throughout the gel.

[0299] Thereafter, beads carrying combinatorial compounds via a photolinker were placed inside the gel and the
compounds were partially released by UV light. Compounds that inhibited the enzyme were observed as local zones
of inhibition having less color change.

[0300] In another example, combinatorial libraries were screened for compounds that had cytotoxic effects on cancer 
cells growing in agar [Salmon et al., 1996, (168)]. 

[0301] Another high throughput screening method is described in Beutel et al., U.S. Patent 5,976,813. In this method,
test samples are placed in a porous matrix. One or more assay components are then placed within, on top of, or at
the bottom of a matrix such as a gel, a plastic sheet, a filter, or other form of easily manipulated solid support. When
samples are introduced to the porous matrix they diffuse sufficiently slowly, such that the assays can be performed
without the test samples running together.

Binding Assays 

[0302] For binding assays, the test compound is preferably a small molecule which binds to and occupies, for
example, the ATP/GTP binding site of the enzyme or the active site of a "BREAST CANCER GENE" polypeptide, such
that normal biological activity is prevented. Examples of such small molecules include, but are not limited to, small
peptides or peptide-like molecules.

[0303] In binding assays, either the test compound or a "BREAST CANCER GENE" polypeptide can comprise a
detectable label, such as a fluorescent, radioisotopic, chemiluminescent, or enzymatic label, such as horseradish
peroxidase, alkaline phosphatase, or luciferase. Detection of a test compound which is bound to a "BREAST CANCER
GENE" polypeptide can then be accomplished, for example, by direct counting of radioemission, by scintillation
counting, or by determining conversion of an appropriate substrate to a detectable product.

[0304] Alternatively, binding of a test compound to a "BREAST CANCER GENE" polypeptide can be determined
without labeling either of the interactants. For example, a microphysiometer can be used to detect binding of a test
compound with a "BREAST CANCER GENE" polypeptide. A microphysiometer (e.g., Cytosensor™) is an analytical
instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric
sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a test compound
and a "BREAST CANCER GENE" polypeptide [McConnell et al., 1992, (169)].

[0305] Determining the ability of a test compound to bind to a "BREAST CANCER GENE" polypeptide also can be
accomplished using a technology such as real-time Bimolecular Interaction Analysis (BIA) [Sjolander & Urbaniczky,
1991, (170), and Szabo et al., 1995, (171)]. BIA is a technology for studying biospecific interactions in real time, without
labeling any of the interactants (e.g., BIAcore™). Changes in the optical phenomenon surface plasmon resonance
(SPR) can be used as an indication of real-time reactions between biological molecules.

[0306] In yet another aspect of the invention, a "BREAST CANCER GENE" polypeptide can be used as a "bait
protein" in a two-hybrid assay or three-hybrid assay [see, e.g., U.S. Patent 5,283,317; Zervos et al., 1993, (172);
Madura et al., 1993, (173); Bartel et al., 1993, (174); Iwabuchi et al., 1993, (175) and Brent WO 94/10300], to identify
other proteins which bind to or interact with the "BREAST CANCER GENE" polypeptide and modulate its activity.

[0307] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable
DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. For example, in one
construct, a polynucleotide encoding a "BREAST CANCER GENE" polypeptide can be fused to a polynucleotide
encoding the DNA binding domain of a known transcription factor (e.g., GAL4). In the other construct a DNA sequence that
encodes an unidentified protein ("prey" or "sample") can be fused to a polynucleotide that codes for the activation
domain of the known transcription factor. If the "bait" and the "prey" proteins are able to interact in vivo to form a
protein-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close
proximity. This proximity allows transcription of a reporter gene (e.g., LacZ), which is operably linked to a transcriptional
regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected, and cell colonies
containing the functional transcription factor can be isolated and used to obtain the DNA sequence encoding the protein
which interacts with the "BREAST CANCER GENE" polypeptide.
[0308] It may be desirable to immobilize either a "BREAST CANCER GENE" polypeptide (or polynucleotide) or the
test compound to facilitate separation of bound from unbound forms of one or both of the interactants, as well as to
accommodate automation of the assay. Thus, either a "BREAST CANCER GENE" polypeptide (or polynucleotide) or
the test compound can be bound to a solid support. Suitable solid supports include, but are not limited to, glass or
plastic slides, tissue culture plates, microtiter wells, tubes, silicon chips, or particles such as beads (including, but not
limited to, latex, polystyrene, or glass beads). Any method known in the art can be used to attach a "BREAST CANCER
GENE" polypeptide (or polynucleotide) or test compound to a solid support, including use of covalent and non-covalent
linkages, passive absorption, or pairs of binding moieties attached respectively to the polypeptide (or polynucleotide)
or test compound and the solid support. Test compounds are preferably bound to the solid support in an array, so that
the location of individual test compounds can be tracked. Binding of a test compound to a "BREAST CANCER GENE"
polypeptide (or polynucleotide) can be accomplished in any vessel suitable for containing the reactants. Examples of
such vessels include microtiter plates, test tubes, and microcentrifuge tubes.

[0309] In one embodiment, a "BREAST CANCER GENE" polypeptide is a fusion protein comprising a domain that
allows the "BREAST CANCER GENE" polypeptide to be bound to a solid support. For example, glutathione
S-transferase fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or
glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and the
nonadsorbed "BREAST CANCER GENE" polypeptide; the mixture is then incubated under conditions conducive to
complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate
wells are washed to remove any unbound components. Binding of the interactants can be determined either directly
or indirectly, as described above. Alternatively, the complexes can be dissociated from the solid support before binding
is determined.

[0310] Other techniques for immobilising proteins or polynucleotides on a solid support also can be used in the
screening assays of the invention. For example, either a "BREAST CANCER GENE" polypeptide (or polynucleotide)
or a test compound can be immobilized utilizing conjugation of biotin and streptavidin. Biotinylated "BREAST CANCER
GENE" polypeptides (or polynucleotides) or test compounds can be prepared from biotin NHS (N-hydroxysuccinimide)
using techniques well known in the art (e.g., biotinylation kit, Pierce Chemicals, Rockford, Ill.) and immobilized in the
wells of streptavidin-coated 96-well plates (Pierce Chemical). Alternatively, antibodies which specifically bind to a
"BREAST CANCER GENE" polypeptide, polynucleotide, or a test compound, but which do not interfere with a desired
binding site, such as the ATP/GTP binding site or the active site of the "BREAST CANCER GENE" polypeptide, can
be derivatised to the wells of the plate. Unbound target or protein can be trapped in the wells by antibody conjugation.

[0311] Methods for detecting such complexes, in addition to those described above for the GST-immobilized
complexes, include immunodetection of complexes using antibodies which specifically bind to a "BREAST CANCER GENE"
polypeptide or test compound, enzyme-linked assays which rely on detecting an activity of a "BREAST CANCER
GENE" polypeptide, and SDS gel electrophoresis under non-reducing conditions.

[0312] Screening for test compounds which bind to a "BREAST CANCER GENE" polypeptide or polynucleotide also
can be carried out in an intact cell. Any cell which comprises a "BREAST CANCER GENE" polypeptide or polynucleotide
can be used in a cell-based assay system. A "BREAST CANCER GENE" polynucleotide can be naturally occurring in
the cell or can be introduced using techniques such as those described above. Binding of the test compound to a
"BREAST CANCER GENE" polypeptide or polynucleotide is determined as described above.

Modulation of Gene Expression 

[0313] In another embodiment, test compounds which increase or decrease "BREAST CANCER GENE" expression
are identified. A "BREAST CANCER GENE" polynucleotide is contacted with a test compound, and the expression of 
an RNA or polypeptide product of the "BREAST CANCER GENE" polynucleotide is determined. The level of expression 
of appropriate mRNA or polypeptide in the presence of the test compound is compared to the level of expression of 
mRNA or polypeptide in the absence of the test compound. The test compound can then be identified as a modulator
of expression based on this comparison. For example, when expression of mRNA or polypeptide is greater in the 
presence of the test compound than in its absence, the test compound is identified as a stimulator or enhancer of the 
mRNA or polypeptide expression. Alternatively, when expression of the mRNA or polypeptide is less in the presence 
of the test compound than in its absence, the test compound is identified as an inhibitor of the mRNA or polypeptide
expression. 
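
The comparison described above can be sketched in Python as follows. This is a minimal illustration only: the function name `classify_modulator` and the optional `tolerance` parameter (a relative margin below which a difference is ignored) are assumptions introduced here and are not part of the disclosure.

```python
def classify_modulator(expr_with, expr_without, tolerance=0.0):
    """Classify a test compound from two expression levels.

    expr_with    -- expression measured in the presence of the compound
    expr_without -- expression measured in the absence of the compound
    tolerance    -- hypothetical relative margin (fraction of expr_without)
                    inside which the two levels are treated as equal
    """
    if expr_without <= 0:
        raise ValueError("expr_without must be positive")
    margin = tolerance * expr_without
    if expr_with > expr_without + margin:
        return "stimulator"   # expression is greater with the compound
    if expr_with < expr_without - margin:
        return "inhibitor"    # expression is less with the compound
    return "no effect"


print(classify_modulator(150.0, 100.0))
print(classify_modulator(40.0, 100.0))
```

With the default `tolerance` of zero, any difference at all is reported, which mirrors the wording of the text; in practice a margin reflecting assay variability would likely be chosen.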

[0314] The level of "BREAST CANCER GENE" mRNA or polypeptide expression in the cells can be determined by 
methods well known in the art for detecting mRNA or polypeptide. Either qualitative or quantitative methods can be 
used. The presence of polypeptide products of a "BREAST CANCER GENE" polynucleotide can be determined, for 
example, using a variety of techniques known in the art, including immunochemical methods such as radioimmunoassay,
Western blotting, and immunohistochemistry. Alternatively, polypeptide synthesis can be determined in vivo,
in a cell culture, or in an in vitro translation system by detecting incorporation of labeled amino acids into a "BREAST 
CANCER GENE" polypeptide. 

[0315] Such screening can be carried out either in a cell-free assay system or in an intact cell. Any cell which expresses
a "BREAST CANCER GENE" polynucleotide can be used in a cell-based assay system. A "BREAST CANCER
GENE" polynucleotide can be naturally occurring in the cell or can be introduced using techniques such as those 
described above. Either a primary culture or an established cell line, such as CHO or human embryonic kidney 293 
cells, can be used. 

Therapeutic Indications and Methods

[0316] Therapies for the treatment of breast cancer have primarily relied upon effective chemotherapeutic drugs for intervention
in cell proliferation, cell growth or angiogenesis. The advent of genomics-driven molecular target identification
has opened up the possibility of identifying new breast cancer-specific targets for therapeutic intervention that will
provide safer, more effective treatments for malignant neoplasia patients and breast cancer patients in particular. Thus,
newly discovered breast cancer-associated genes and their products can be used as tools to develop innovative therapies.
The identification of the Her2/neu receptor kinase presents exciting new opportunities for treatment of a certain
subset of tumor patients as described before. Genes playing important roles in any of the physiological processes
outlined above can be characterized as breast cancer targets. Genes or gene fragments identified through genomics
can readily be expressed in one or more heterologous expression systems to produce functional recombinant proteins.
These proteins are characterized in vitro for their biochemical properties and then used as tools in high-throughput
molecular screening programs to identify chemical modulators of their biochemical activities. Modulators of target gene
expression or protein activity can be identified in this manner and subsequently tested in cellular and in vivo disease
models for therapeutic activity. Optimization of lead compounds with iterative testing in biological models and detailed
pharmacokinetic and toxicological analyses form the basis for drug development and subsequent testing in humans.
[0317] This invention further pertains to the use of novel agents identified by the screening assays described above. 
Accordingly, it is within the scope of this invention to use a test compound identified as described herein in an appropriate 
animal model. For example, an agent identified as described herein (e.g., a modulating agent, an antisense polynucleotide
molecule, a specific antibody, ribozyme, or a human "BREAST CANCER GENE" polypeptide binding molecule)
can be used in an animal model to determine the efficacy, toxicity, or side effects of treatment with such an agent.
Alternatively, an agent identified as described herein can be used in an animal model to determine the mechanism of 
action of such an agent. Furthermore, this invention pertains to uses of novel agents identified by the above described 
screening assays for treatments as described herein. 

[0318] A reagent which affects human "BREAST CANCER GENE" activity can be administered to a human cell, 
either in vitro or in vivo, to reduce or increase human "BREAST CANCER GENE" activity. The reagent preferably binds
to an expression product of a human "BREAST CANCER GENE". If the expression product is a protein, the reagent 
is preferably an antibody. For treatment of human cells ex vivo, an antibody can be added to a preparation of stem 
cells which have been removed from the body. The cells can then be replaced in the same or another human body, 
with or without clonal propagation, as is known in the art. 
[0319] In one embodiment, the reagent is delivered using a liposome. Preferably, the liposome is stable in the animal
into which it has been administered for at least about 30 minutes, more preferably for at least about 1 hour, and even 
more preferably for at least about 24 hours. A liposome comprises a lipid composition that is capable of targeting a 
reagent, particularly a polynucleotide, to a particular site in an animal, such as a human. Preferably, the lipid composition 
of the liposome is capable of targeting to a specific organ of an animal, such as the lung, liver, spleen, heart, brain,
lymph nodes, and skin.

[0320] A liposome useful in the present invention comprises a lipid composition that is capable of fusing with the 
plasma membrane of the targeted cell to deliver its contents to the cell. Preferably, the transfection efficiency of a
liposome is about 0.5 µg of DNA per 16 nmol of liposome delivered to about 10^6 cells, more preferably about 1.0 µg
of DNA per 16 nmol of liposome delivered to about 10^6 cells, and even more preferably about 2.0 µg of DNA per 16
nmol of liposome delivered to about 10^6 cells. Preferably, a liposome is between about 100 and 500 nm, more preferably
between about 150 and 450 nm, and even more preferably between about 200 and 400 nm in diameter.
[0321] Suitable liposomes for use in the present invention include those liposomes usually used in, for example, 
gene delivery methods known to those of skill in the art. More preferred liposomes include liposomes having a polycationic
lipid composition and/or liposomes having a cholesterol backbone conjugated to polyethylene glycol. Optionally,
a liposome comprises a compound capable of targeting the liposome to a particular cell type, such as a cell-specific
ligand exposed on the outer surface of the liposome. 

[0322] Complexing a liposome with a reagent such as an antisense oligonucleotide or ribozyme can be achieved 
using methods which are standard in the art (see, for example, U.S. Patent 5,705,151). Preferably, from about 0.1 µg
to about 10 µg of polynucleotide is combined with about 8 nmol of liposomes, more preferably from about 0.5 µg to
about 5 µg of polynucleotides are combined with about 8 nmol liposomes, and even more preferably about 1.0 µg of
polynucleotides is combined with about 8 nmol liposomes. 

[0323] In another embodiment, antibodies can be delivered to specific tissues in vivo using receptor-mediated targeted
delivery. Receptor-mediated DNA delivery techniques are taught in, for example, Findeis et al., 1993, (176);
Chiou et al., 1994, (177); Wu & Wu, 1988, (178); Wu et al., 1994, (179); Zenke et al., 1990, (180); Wu et al., 1991, (181).

Determination of a Therapeutically Effective Dose 

[0324] The determination of a therapeutically effective dose is well within the capability of those skilled in the art. A
therapeutically effective dose refers to that amount of active ingredient which increases or decreases human "BREAST 
CANCER GENE" activity relative to the human "BREAST CANCER GENE" activity which occurs in the absence of the 
therapeutically effective dose. 

[0325] For any compound, the therapeutically effective dose can be estimated initially either in cell culture assays 
or in animal models, usually mice, rabbits, dogs, or pigs. The animal model also can be used to determine the appropriate
concentration range and route of administration. Such information can then be used to determine useful doses
and routes for administration in humans. 

[0326] Therapeutic efficacy and toxicity, e.g., ED50 (the dose therapeutically effective in 50% of the population) and
LD50 (the dose lethal to 50% of the population), can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals. The dose ratio of toxic to therapeutic effects is the therapeutic index, and it can be
expressed as the ratio, LD50/ED50.
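
As a brief worked illustration of this ratio, the following Python sketch computes a therapeutic index from two doses. The function name and the example values are hypothetical and serve only to show the arithmetic; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, i.e. the ratio LD50 / ED50.

    Both doses must be positive and in the same units (for example mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses: LD50 = 500 mg/kg, ED50 = 25 mg/kg
print(therapeutic_index(500.0, 25.0))
```

A larger value indicates a wider margin between the effective and the lethal dose.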

[0327] Pharmaceutical compositions which exhibit large therapeutic indices are preferred. The data obtained from 
cell culture assays and animal studies is used in formulating a range of dosage for human use. The dosage contained 
in such compositions is preferably within a range of circulating concentrations that include the ED50 with little or no
toxicity. The dosage varies within this range depending upon the dosage form employed, sensitivity of the patient, and
the route of administration. 

[0328] The exact dosage will be determined by the practitioner, in light of factors related to the subject that requires 
treatment. Dosage and administration are adjusted to provide sufficient levels of the active ingredient or to maintain 
the desired effect. Factors which can be taken into account include the severity of the disease state, general health of 
the subject, age, weight, and gender of the subject, diet, time and frequency of administration, drug combination(s),
reaction sensitivities, and tolerance/response to therapy. Long-acting pharmaceutical compositions can be adminis- 
tered every 3 to 4 days, every week, or once every two weeks depending on the half-life and clearance rate of the 
particular formulation. 

[0329] Normal dosage amounts can vary from 0.1 to 100,000 micrograms, up to a total dose of about 1 g, depending 
upon the route of administration. Guidance as to particular dosages and methods of delivery is provided in the literature
and generally available to practitioners in the art. Those skilled in the art will employ different formulations for nucleotides 
than for proteins or their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular 
cells, conditions, locations, etc. 

[0330] If the reagent is a single-chain antibody, polynucleotides encoding the antibody can be constructed and introduced
into a cell either ex vivo or in vivo using well-established techniques including, but not limited to, transferrin-polycation-mediated
DNA transfer, transfection with naked or encapsulated nucleic acids, liposome-mediated cellular
fusion, intracellular transportation of DNA-coated latex beads, protoplast fusion, viral infection, electroporation, a gene
gun, and DEAE- or calcium phosphate-mediated transfection.

[0331] Effective in vivo dosages of an antibody are in the range of about 5 µg to about 50 µg/kg, about 50 µg to about
5 mg/kg, about 100 µg to about 500 µg/kg of patient body weight, and about 200 to about 250 µg/kg of patient body
weight. For administration of polynucleotides encoding single-chain antibodies, effective in vivo dosages are in the
range of about 100 ng to about 200 ng, 500 ng to about 50 mg, about 1 µg to about 2 mg, about 5 µg to about 500 µg,
and about 20 ng to about 100 µg of DNA.
[0332] If the expression product is mRNA, the reagent is preferably an antisense oligonucleotide or a ribozyme.
Polynucleotides which express antisense oligonucleotides or ribozymes can be introduced into cells by a variety of 
methods, as described above. 

[0333] Preferably, a reagent reduces expression of a "BREAST CANCER GENE" gene or the activity of a "BREAST
CANCER GENE" polypeptide by at least about 10%, preferably about 50%, more preferably about 75%, 90%, or 100% relative
to the absence of the reagent. The effectiveness of the mechanism chosen to decrease the level of expression of a
"BREAST CANCER GENE" gene or the activity of a "BREAST CANCER GENE" polypeptide can be assessed using
methods well known in the art, such as hybridization of nucleotide probes to "BREAST CANCER GENE"-specific mRNA,
quantitative RT-PCR, immunologic detection of a "BREAST CANCER GENE" polypeptide, or measurement of
"BREAST CANCER GENE" activity.

[0334] In any of the embodiments described above, any of the pharmaceutical compositions of the invention can be 
administered in combination with other appropriate therapeutic agents. Selection of the appropriate agents for use in 
combination therapy can be made by one of ordinary skill in the art, according to conventional pharmaceutical principles. 
The combination of therapeutic agents can act synergistically to effect the treatment or prevention of the various disorders
described above. Using this approach, one may be able to achieve therapeutic efficacy with lower dosages of
each agent, thus reducing the potential for adverse side effects. 

[0335] Any of the therapeutic methods described above can be applied to any subject in need of such therapy, 
including, for example, birds and mammals such as dogs, cats, cows, pigs, sheep, goats, horses, rabbits, monkeys, 
and most preferably, humans. 

[0336] All patents and patent applications cited in this disclosure are expressly incorporated herein by reference.
The above disclosure generally describes the present invention. A more complete understanding can be obtained by 
reference to the following specific examples which are provided for purposes of illustration only and are not intended 
to limit the scope of the invention. 

Pharmaceutical Compositions

[0337] The invention also provides pharmaceutical compositions which can be administered to a patient to achieve 
a therapeutic effect. Pharmaceutical compositions of the invention can comprise, for example, a "BREAST CANCER 
GENE" polypeptide, "BREAST CANCER GENE" polynucleotide, ribozymes or antisense oligonucleotides, antibodies
which specifically bind to a "BREAST CANCER GENE" polypeptide, or mimetics, agonists, antagonists, or inhibitors
of a "BREAST CANCER GENE" polypeptide activity. The compositions can be administered alone or in combination 
with at least one other agent, such as a stabilizing compound, which can be administered in any sterile, biocompatible
pharmaceutical carrier, including, but not limited to, saline, buffered saline, dextrose, and water. The compositions can 
be administered to a patient alone, or in combination with other agents, drugs or hormones. 

[0338] In addition to the active ingredients, these pharmaceutical compositions can contain suitable pharmaceutically
acceptable carriers comprising excipients and auxiliaries which facilitate processing of the active compounds into prep- 
arations which can be used pharmaceutically. Pharmaceutical compositions of the invention can be administered by 
any number of routes including, but not limited to, oral, intravenous, intramuscular, intraarterial, intramedullary, intrath- 
ecal, intraventricular, transdermal, subcutaneous, intraperitoneal, intranasal, parenteral, topical, sublingual, or rectal 
means. Pharmaceutical compositions for oral administration can be formulated using pharmaceutically acceptable 
carriers well known in the art in dosages suitable for oral administration. Such carriers enable the pharmaceutical 
compositions to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the
like, for ingestion by the patient. 

[0339] Pharmaceutical preparations for oral use can be obtained through combination of active compounds with 
solid excipient, optionally grinding a resulting mixture, and processing the mixture of granules, after adding suitable
auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are carbohydrate or protein fillers, such as
sugars, including lactose, sucrose, mannitol, or sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose,
such as methyl cellulose, hydroxypropylmethylcellulose, or sodium carboxymethylcellulose; gums including arabic and
tragacanth; and proteins such as gelatin and collagen. If desired, disintegrating or solubilizing agents can be added,
such as the cross-linked polyvinyl pyrrolidone, agar, alginic acid, or a salt thereof, such as sodium alginate.

[0340] Dragee cores can be used in conjunction with suitable coatings, such as concentrated sugar solutions, which 
also can contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lac- 
quer solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments can be added to the tablets 
or dragee coatings for product identification or to characterize the quantity of active compound, i.e., dosage. 
[0341] Pharmaceutical preparations which can be used orally include push-fit capsules made of gelatin, as well as
soft, sealed capsules made of gelatin and a coating, such as glycerol or sorbitol. Push-fit capsules can contain active 
ingredients mixed with a filler or binders, such as lactose or starches, lubricants, such as talc or magnesium stearate, 
and, optionally, stabilizers. In soft capsules, the active compounds can be dissolved or suspended in suitable liquids, 
such as fatty oils, liquid, or liquid polyethylene glycol with or without stabilizers.

[0342] Pharmaceutical formulations suitable for parenteral administration can be formulated in aqueous solutions, 
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or physiologically buffered 
saline. Aqueous injection suspensions can contain substances which increase the viscosity of the suspension, such
as sodium carboxymethyl cellulose, sorbitol, or dextran. Additionally, suspensions of the active compounds can be
prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils such as 
sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Non-lipid polycationic 
amino polymers also can be used for delivery. Optionally, the suspension also can contain suitable stabilizers or agents 
which increase the solubility of the compounds to allow for the preparation of highly concentrated solutions. For topical
or nasal administration, penetrants appropriate to the particular barrier to be permeated are used in the formulation.
Such penetrants are generally known in the art. 

[0343] The pharmaceutical compositions of the present invention can be manufactured in a manner that is known 
in the art, e.g., by means of conventional mixing, dissolving, granulating, dragee making, levigating, emulsifying, encapsulating,
entrapping, or lyophilizing processes. The pharmaceutical composition can be provided as a salt and can
be formed with many acids, including but not limited to, hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic,
etc. Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free base forms.
In other cases, the preferred preparation can be a lyophilized powder which can contain any or all of the following: 1-50
mM histidine, 0.1%-2% sucrose, and 2-7% mannitol, at a pH range of 4.5 to 5.5, that is combined with buffer prior to use.

[0344] Further details on techniques for formulation and administration can be found in the latest edition of REMINGTON'S
PHARMACEUTICAL SCIENCES (182). After pharmaceutical compositions have been prepared, they can
be placed in an appropriate container and labeled for treatment of an indicated condition. Such labeling would include 
amount, frequency, and method of administration. 

Material and Methods


[0345] One strategy for identifying genes that are involved in breast cancer is to detect genes that are expressed 
differentially under conditions associated with the disease versus non-disease conditions. The sub-sections below 
describe a number of experimental systems which may be used to detect such differentially expressed genes. In general,
these experimental systems include at least one experimental condition in which subjects or samples are treated
in a manner associated with breast cancer, in addition to at least one experimental control condition lacking such
disease associated treatment. Differentially expressed genes are detected, as described below, by comparing the 
pattern of gene expression between the experimental and control conditions. 

[0346] Once a particular gene has been identified through the use of one such experiment, its expression pattern 
may be further characterized by studying its expression in a different experiment and the findings may be validated by 
an independent technique. Such use of multiple experiments may be useful in distinguishing the roles and relative
importance of particular genes in breast cancer. A combined approach, comparing gene expression pattern in cells 
derived from breast cancer patients to those of in vitro cell culture models can give substantial hints on the pathways 
involved in development and/or progression of breast cancer. 

[0347] Among the experiments which may be utilized for the identification of differentially expressed genes involved 
in malignant neoplasia and breast cancer, for example, are experiments designed to analyze those genes which are
involved in signal transduction. Such experiments may serve to identify genes involved in the proliferation of cells. 
[0348] Methods for the identification of genes which are involved in breast cancer are described below. Such genes
are those which are differentially expressed in breast cancer conditions relative to their expression in normal, or
non-breast cancer conditions or upon experimental manipulation based on clinical observations. Such differentially
expressed genes represent "target" and/or "marker" genes. Methods for the further characterization of such differentially
expressed genes, and for their identification as target and/or marker genes, are presented below.
[0349] Alternatively, a differentially expressed gene may have its expression modulated, i.e., quantitatively increased 
or decreased, in normal versus breast cancer states, or under control versus experimental conditions. The degree to 
which expression differs in normal versus breast cancer or control versus experimental states need only be large 
enough to be visualized via standard characterization techniques, such as, for example, the differential display technique
described below. Other such standard characterization techniques by which expression differences may be visualized
include but are not limited to quantitative RT-PCR and Northern analyses, which are well known to those of
skill in the art. 

EXAMPLE 1

Expression profiling 

a) Expression profiling utilizing quantitative RT-PCR

[0350] For a detailed analysis of gene expression by quantitative PCR methods, one will utilize primers flanking the
genomic region of interest and a fluorescently labeled probe hybridizing in between. Such an expression measurement
can be performed using the PRISM 7700 Sequence Detection System of PE Applied Biosystems (Perkin Elmer, Foster
City, CA, USA) with the technique of a fluorogenic probe, consisting of an oligonucleotide labeled with both a
fluorescent reporter dye and a quencher dye. Amplification of the probe-specific product causes cleavage of the probe,
generating an increase in reporter fluorescence. Primers and probes were selected using the Primer Express software
and localized mostly in the 3' region of the coding sequence or in the 3' untranslated region (see Table 5 for primer
and probe sequences) according to the relative positions of the probe sequence used for the construction of the
Affymetrix HG_U95A-E or HG_U133A-B DNA-chips. All primer pairs were checked for specificity by conventional PCR
reactions. To standardize the amount of sample RNA, GAPDH was selected as a reference, since it was not differentially
regulated in the samples analyzed. TaqMan validation experiments were performed showing that the efficiencies of
the target and the control amplifications are approximately equal, which is a prerequisite for the relative quantification
of gene expression by the comparative ΔΔCT method, known to those skilled in the art.
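
The comparative ΔΔCT calculation referred to above can be sketched in Python as follows. This is a minimal illustration under the stated prerequisite of approximately equal (close to 100%) amplification efficiencies for target and reference; the function and parameter names are assumptions introduced here, the "calibrator" stands for whichever sample is chosen as the baseline (for example, normal tissue), and the CT values shown are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the comparative delta-delta-CT method.

    Each argument is a threshold cycle (CT). The reference gene
    (e.g. GAPDH) normalizes for the amount of input RNA.
    Assumes roughly equal amplification efficiencies, so that one
    cycle corresponds to a two-fold change.
    """
    # Normalize the target to the reference within each sample
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    # Compare the sample to the calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    # A lower CT means more template, hence the negative exponent
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: tumor sample vs. normal calibrator
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold_change)
```

A result above 1 indicates higher expression of the target in the sample than in the calibrator; a result below 1 indicates lower expression.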

[0351] As well as the technology provided by Perkin Elmer, one may use other technique implementations such as the LightCycler™
from Roche Inc. or the iCycler from Stratagene Inc.

b) Expression profiling utilizing DNA microarrays 

[0352] Expression profiling can be carried out using the Affymetrix Array Technology. By hybridization of mRNA to
such a DNA-array or DNA-Chip, it is possible to identify the expression value of each transcript from the signal intensity
at a certain position of the array. Usually these DNA-arrays are produced by spotting of cDNA, oligonucleotides or subcloned
DNA fragments. In the case of Affymetrix technology, approximately 400,000 individual oligonucleotide sequences were
synthesized on the surface of a silicon wafer at distinct positions. The minimal length of oligomers is 12 nucleotides,
preferably 25 nucleotides or the full length of the transcript in question. Expression profiling may also be carried out by
hybridization to nylon or nitrocellulose membrane bound DNA or oligonucleotides. Detection of signals derived from
hybridization may be obtained by either colorimetric, fluorescent, electrochemical, electronic, optic or radioactive
readout. Detailed descriptions of array construction have been given above and in other patents cited. To determine
the quantitative and qualitative changes in the chromosomal region to be analyzed, RNA from tumor tissue which is suspected
to contain such genomic alterations has to be compared to RNA extracted from benign tissue (e.g. epithelial
breast tissue, or micro dissected ductal tissue) on the basis of expression profiles for the whole transcriptome. With
minor modifications, the sample preparation protocol followed the Affymetrix GeneChip Expression Analysis Manual
(Santa Clara, CA). Total RNA extraction and isolation from tumor or benign tissues, biopsies, cell isolates or cell-containing
body fluids can be performed by using TRIzol (Life Technologies, Rockville, MD) and Oligotex mRNA Midi kit
(Qiagen, Hilden, Germany), and an ethanol precipitation step should be carried out to bring the concentration to 1 mg/ml.
Double stranded cDNA is created from 5-10 mg of mRNA by the Superscript system (Life Technologies). First
strand cDNA synthesis was primed with a T7-(dT24) oligonucleotide. The cDNA can be extracted with phenol/chloroform
and precipitated with ethanol to a final concentration of 1 mg/ml. From the generated cDNA, cRNA can be synthesized
using Enzo's (Enzo Diagnostics Inc., Farmingdale, NY) in vitro Transcription Kit. Within the same step the
cRNA can be labeled with biotin nucleotides Bio-11-CTP and Bio-16-UTP (Enzo Diagnostics Inc., Farmingdale, NY).
After labeling and cleanup (Qiagen, Hilden, Germany) the cRNA should then be fragmented in an appropriate fragmentation
buffer (e.g., 40 mM Tris-Acetate, pH 8.1, 100 mM KOAc, 30 mM MgOAc, for 35 minutes at 94°C). As per
the Affymetrix protocol, fragmented cRNA should be hybridized on the HG_U133 arrays A and B, comprising approximately
40,000 probed transcripts each, for 24 hours at 60 rpm in a 45°C hybridization oven. After the hybridization step the chip
surfaces have to be washed and stained with streptavidin phycoerythrin (SAPE; Molecular Probes, Eugene, OR) in
Affymetrix fluidics stations. To amplify staining, a second labeling step can be introduced, which is recommended but
not compulsory. Here one should add SAPE solution twice with an antistreptavidin biotinylated antibody. Hybridization
to the probe arrays may be detected by fluorometric scanning (Hewlett Packard Gene Array Scanner; Hewlett Packard
Corporation, Palo Alto, CA).

[0353] After hybridization and scanning, the microarray images can be analyzed for quality control, looking for major
chip defects or abnormalities in hybridization signal. For this purpose, either Affymetrix GeneChip MAS 5.0 Software or other
microarray image analysis software can be utilized. Primary data analysis should be carried out by software provided
by the manufacturer.
[0354] In the case of the genes analyzed in one embodiment of this invention, the primary data have been analyzed by
further bioinformatic tools and additional filter criteria. The bioinformatic analysis is described in detail below. 

c) Data analysis 

[0355] According to Affymetrix measurement technique (Affymetrix GeneChip Expression Analysis Manual, Santa 
Clara, CA) a single gene expression measurement on one chip yields the average difference value and the absolute 
call. Each chip contains 16-20 oligonucleotide probe pairs per gene or cDNA clone. These probe pairs include perfectly
matched sets and mismatched sets, both of which are necessary for the calculation of the average difference, or 
expression value, a measure of the intensity difference for each probe pair, calculated by subtracting the intensity of 
the mismatch from the intensity of the perfect match. This takes into consideration variability in hybridization among 
probe pairs and other hybridization artifacts that could affect the fluorescence intensities. The average difference is a 
numeric value intended to represent the expression value of that gene. The absolute call can take the values 'A'
(absent), 'M' (marginal), or 'P' (present) and denotes the quality of a single hybridization. We used both the quantitative
information given by the average difference and the qualitative information given by the absolute call to identify the
genes which are differentially expressed in biological samples from individuals with breast cancer versus biological
samples from the normal population. With algorithms other than the Affymetrix one, we have obtained different numerical
values representing the same expression values and expression differences upon comparison.
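
The average difference described in this paragraph can be illustrated with a short Python sketch. It is a simplified reading of the text: a plain mean of the perfect-match minus mismatch intensities over all probe pairs of one gene, without any of the outlier handling that the manufacturer's software may apply. The function name and the intensity values are hypothetical.

```python
def average_difference(perfect_match, mismatch):
    """Mean of (perfect match - mismatch) over the probe pairs of one gene.

    perfect_match -- intensities of the perfectly matched probes
    mismatch      -- intensities of the paired mismatch probes (same order)
    """
    if len(perfect_match) != len(mismatch) or not perfect_match:
        raise ValueError("need equally sized, non-empty probe lists")
    # One difference per probe pair
    differences = [pm - mm for pm, mm in zip(perfect_match, mismatch)]
    return sum(differences) / len(differences)


# Hypothetical intensities for three probe pairs of one gene
print(average_difference([1200, 900, 1500], [200, 300, 400]))
```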
[0356] The differential expression E in one of the breast cancer groups compared to the normal population is calculated
as follows. Given n average difference values d_1, d_2, ..., d_n in the breast cancer population and m average difference
values c_1, c_2, ..., c_m in the population of normal individuals, it is computed by the equation:



If d_i<50 or c_j<50 for one or more values of i and j, these particular values c_j and/or d_i are set to an "artificial" expression
value of 50. This particular computation of E allows for a correct comparison to TaqMan results.
[0357] A gene is called up-regulated in breast cancer versus normal if E>1.5 and if the number of absolute calls
equal to 'P' in the breast cancer population is greater than n/2.

[0358] A gene is called down-regulated in breast cancer versus normal if E<1 .5 and if the number of absolute calls
equal to 'P' in the normal population is greater than m/2.
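The flooring step in [0356] and the call rules in [0357] and [0358] can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, E is taken as an input because its defining equation is not reproduced in this text, and the down-regulation threshold is a required parameter because its printed value is not clearly legible here.

```python
from typing import List, Sequence


def floor_values(values: Sequence[float], floor: float = 50.0) -> List[float]:
    """Set every average difference value below `floor` to `floor` ([0356])."""
    return [max(v, floor) for v in values]


def present_majority(calls: Sequence[str]) -> bool:
    """True if more than half of the absolute calls are 'P' (present)."""
    return sum(1 for call in calls if call == "P") > len(calls) / 2


def classify_gene(e: float,
                  cancer_calls: Sequence[str],
                  normal_calls: Sequence[str],
                  *,
                  down_threshold: float,
                  up_threshold: float = 1.5) -> str:
    """Return 'up', 'down' or 'unchanged' for one gene.

    `e` is the differential expression E, computed elsewhere.
    `down_threshold` must be supplied by the caller (assumption, see lead-in).
    """
    if e > up_threshold and present_majority(cancer_calls):
        return "up"
    if e < down_threshold and present_majority(normal_calls):
        return "down"
    return "unchanged"


# Hypothetical example: E = 2.0, two of three tumor chips call the gene present.
print(classify_gene(2.0, ["P", "P", "A"], ["A", "A", "A"], down_threshold=0.5))
```

Keeping the two thresholds as parameters makes it easy to rerun the classification once the intended cut-offs are confirmed against the printed specification.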

[0359] The final list of differentially regulated genes consists of all up-regulated and all down-regulated genes in 
biological samples from individuals with breast cancer versus biological samples from the normal population. Those 
genes on this list which are interesting for a pharmaceutical application were finally validated by TaqMan. If a good 
correlation between the expression values/behavior of a transcript could be observed with both techniques, such a 
gene is listed in Tables 1 to 3. 

[0360] Since not only the information on differential expression of a single gene within an identified ARCHEON, but
also the information on the co-regulation of several members is important for predictive, diagnostic, preventive and
therapeutic purposes, we have combined expression data with information on the chromosomal position (e.g. golden
path) taken from publicly available databases to develop a picture of the overall transcriptome of a given tumor sample.
By this technique not only known or suspected regions of genomes can be inspected but, even more valuable, new
regions of dysregulation with chromosomal linkage can be identified. This is of value in other types of neoplasia or viral
integration and chromosomal rearrangements. By SQL-based database searches one can retrieve information on
expression, qualitative value of a measurement (denoted by Affymetrix MAS 5.0 Software), expression values derived
from techniques other than DNA-chip hybridization, and chromosomal linkage.
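A SQL-based search of the kind mentioned in [0360] might look like the following Python sketch, which uses the standard-library sqlite3 module with an in-memory database. The schema (tables `position` and `expression`, their column names) and all inserted values are hypothetical and serve only to show how expression values and absolute calls can be joined to chromosomal positions.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a made-up schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE position (probe_set TEXT PRIMARY KEY,
                               chromosome TEXT, start_bp INTEGER);
        CREATE TABLE expression (probe_set TEXT, sample TEXT,
                                 value REAL, abs_call TEXT);
        """
    )
    conn.executemany("INSERT INTO position VALUES (?, ?, ?)",
                     [("ps_a", "17", 300), ("ps_b", "17", 100), ("ps_c", "1", 50)])
    conn.executemany("INSERT INTO expression VALUES (?, ?, ?, ?)",
                     [("ps_a", "tumor_1", 900.0, "P"),
                      ("ps_b", "tumor_1", 120.0, "A"),
                      ("ps_c", "tumor_1", 400.0, "P")])
    return conn


def region_profile(conn, sample, chromosome, start_bp, end_bp):
    """Expression values and calls of one sample, ordered along a region."""
    query = """
        SELECT p.probe_set, p.start_bp, e.value, e.abs_call
        FROM position AS p
        JOIN expression AS e ON e.probe_set = p.probe_set
        WHERE e.sample = ? AND p.chromosome = ?
          AND p.start_bp BETWEEN ? AND ?
        ORDER BY p.start_bp
    """
    return conn.execute(query, (sample, chromosome, start_bp, end_bp)).fetchall()


conn = build_demo_db()
for row in region_profile(conn, "tumor_1", "17", 0, 1000):
    print(row)
```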

EXAMPLE 2 



Identification of the ARCHEON 

a) Identification and localization of genes or gene probes (represented by the so called probe sets on Affymetrix arrays 
HG-U95A-E or HG-U133A-B) in their chromosomal context and order on the human genome. 

[0361] For identification of larger chromosomal changes or aberrations, as they have been described in detail above,
a sufficient number of genes, transcripts or DNA fragments is needed. In the case of array-based CGH, the density of
probes covering a chromosomal region is not necessarily limited to the transcribed genes, but when RNA is used
as probe material the density is given by the distance of genes on a chromosome. The DNA microarrays provided by
Affymetrix Inc. contain, to date, all transcripts from the known human genome, which are represented by 40,000
- 60,000 probe sets. By BLAST mapping and sorting the sequences of these short DNA oligomers to the publicly available
sequence of the human genome represented by the so-called "golden path", available at the University of California in
Santa Cruz or from the NCBI, a chromosomal display of the whole transcriptome of a tissue specimen evolves. By
graphical display of the individual chromosomal regions and color coding of over- or under-represented transcripts
compared to a reference transcriptome, regions with DNA gains and losses can be identified.
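The chromosomal display described in [0361] can be approximated by sorting probe sets by genomic position and labeling each one relative to a reference transcriptome. The sketch below is an assumption-laden illustration: the data structures, the use of a log2 ratio, and the two-fold threshold are not taken from the specification, and the BLAST mapping step is assumed to have been done already.

```python
import math
from typing import Dict, List, Tuple


def chromosomal_profile(positions: Dict[str, Tuple[str, int]],
                        sample: Dict[str, float],
                        reference: Dict[str, float],
                        chromosome: str,
                        threshold: float = 1.0) -> List[Tuple[str, int, str]]:
    """Label probe sets on one chromosome as over, under or unchanged.

    positions maps probe set -> (chromosome, start position).
    threshold is a log2 ratio cut-off (1.0 = two-fold, hypothetical choice).
    Expression values must be positive.
    """
    rows = []
    for probe, (chrom, start) in positions.items():
        if chrom != chromosome or probe not in sample or probe not in reference:
            continue
        log_ratio = math.log2(sample[probe] / reference[probe])
        if log_ratio >= threshold:
            label = "over"
        elif log_ratio <= -threshold:
            label = "under"
        else:
            label = "unchanged"
        rows.append((probe, start, label))
    rows.sort(key=lambda row: row[1])  # order along the chromosome
    return rows


# Made-up positions and expression values for three probe sets.
positions = {"p1": ("17", 300), "p2": ("17", 100), "p3": ("1", 5)}
sample = {"p1": 800.0, "p2": 100.0, "p3": 50.0}
reference = {"p1": 200.0, "p2": 100.0, "p3": 50.0}
print(chromosomal_profile(positions, sample, reference, "17"))
```

Runs of neighboring probe sets that share the "over" or "under" label are the candidates for regional gains or losses; a plotting layer could map the labels to colors.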


b) Quantification of gene copy numbers by combined IHC and quantitative PCR (PCR karyotyping) or directly by 
quantitative PCR 

[0362] Usually one to three paraffin-embedded tissue sections that are 5 µm thick are used to obtain genomic DNA
from the samples. Tissue sections are stained by colorimetric IHC after deparaffinization to identify regions containing
disease-associated cells. Stained regions are macrodissected with a scalpel and transferred into a microcentrifuge
tube. The genomic DNA of these isolated tissue sections is extracted using appropriate buffers. The isolated DNA is
then used for quantitative PCR with appropriate primers and probes. Optionally the IHC staining can be omitted and
the genomic DNA can be directly isolated, with or without prior deparaffinization, with appropriate buffers. Those who
are skilled in the art may vary the conditions and buffers described below to obtain equivalent results.

[0363] Reagents from DAKO (HercepTest Code No. K 5204) and TaKaRa (Biomedicals Cat.: 9091) were used
according to the manufacturer's protocol.

[0364] It is convenient to prepare the following reagents prior to staining:

Solution No. 7

[0365] Epitope Retrieval Solution (Citrate buffer + antimicrobial agent) (10x conc.) 20 ml ad 200 ml aqua dest. (stable
for 1 month at 2-8°C)

Solution No. 8

[0366] Washing buffer (Tris-HCl + antimicrobial agent) (10x conc.) 30 ml ad 300 ml distilled water (stable for 1 month
at 2-8°C)

Staining solution: DAB

[0367] 1 ml solution is sufficient for 10 slides. The solution is prepared immediately before use:
[0368] 1 ml DAB buffer (Substrate Buffer solution, pH 7.5, containing H2O2, stabilizer, enhancers and an antimicrobial
agent) + 1 drop (25-3 µl) DAB-Chromogen (3,3'-diaminobenzidine chromogen solution). This solution is stable for up
to 5 days at 2-8°C. Precipitated substances do not influence the staining result. Additionally required are: 2x approx.
100 ml xylene, 2x approx. 100 ml ethanol 100%, 2x ethanol 95%, aqua dest. These solutions can be used for up to 40
stainings. A water bath is required for the epitope retrieval step.

Staining procedure: 

[0369] All reagents are pre-warmed to room temperature (20-25°C) prior to immunostaining. Likewise, all incubations
are performed at room temperature, except the epitope retrieval, which is performed in a 95°C water bath. Between
the steps, excess liquid is tapped off from the slides with lint-free tissue (Kim Wipe).

Deparaffinization

[0370] Slides are placed in a xylene bath and incubated for 5 minutes. The bath is changed and the step repeated
once. Excess liquid is tapped off and the slides are placed in absolute ethanol for 3 minutes. The bath is changed
and the step repeated once. Excess liquid is tapped off and the slides are placed in 95% ethanol for 3 minutes. The
bath is changed and the step repeated once. Excess liquid is tapped off and the slides are placed in distilled water
for a minimum of 30 seconds.






EP 1 365 034 A2 



Epitope Retrieval

[0371] Staining jars are filled with diluted epitope retrieval solution and preheated in a water bath at 95°C. The
deparaffinized sections are immersed into the preheated solution in the staining jars and incubated for 40 minutes at
95°C. The entire jar is removed from the water bath and allowed to cool down at room temperature for 20 minutes.
The epitope retrieval solution is decanted, the sections are rinsed in distilled water and finally soaked in wash buffer
for 5 minutes.

Peroxidase Blocking:

[0372] Excess buffer is tapped off and the tissue section is encircled with a DAKO pen. The specimen is covered
with 3 drops (100 µl) of Peroxidase-Blocking solution and incubated for 5 minutes. The slides are rinsed in distilled water
and placed into a fresh washing buffer bath.

Antibody Incubation

[0373] Excess liquid is tapped off and the specimens are covered with 3 drops (100 µl) of Anti-Her-2/neu reagent
(Rabbit Anti-Human Her2 Protein in 0.05 mol/L Tris/HCl, 0.1 mol/L NaCl, 15 mmol/L NaN3, pH 7.2, containing stabilizing
protein) or negative control reagent (= IgG fraction of normal rabbit serum at an equivalent protein concentration as
the Her2 Ab). After 30 minutes of incubation the slide is rinsed in water and placed into a fresh water bath.

Visualization 

[0374] Excess liquid is tapped off and the specimens are covered with 3 drops (100 µl) of visualization reagent.
After 30 minutes of incubation the slide is rinsed in water and placed into a fresh water bath. Excess liquid is tapped
off and the specimens are covered with 3 drops (100 µl) of Substrate-Chromogen solution (DAB) for 10 minutes. After
rinsing the specimen with distilled water, photographs are taken with a conventional Olympus microscope to document
the staining intensity and tumor regions within the specimen. Optionally, a counterstain with hematoxylin is performed.

DNA extraction

[0375] The whole specimens or dissected subregions are transferred into microcentrifuge tubes. Optionally, a small
amount (10 µl) of preheated TaKaRa solution (DEXPAT™) is placed onto the specimen to facilitate
sample transfer with a scalpel. 50 to 150 µl of TaKaRa solution are added to the samples depending on the size of
the tissue sample selected. The samples are incubated at 100°C for 10 minutes in a block heater, followed by centrifugation
at 12,000 rpm in a microcentrifuge. The supernatant is collected using a micropipette and placed in a separate
microcentrifuge tube. If no deparaffinization step has been undertaken, care must be taken not to withdraw tissue debris
and resin. Genomic DNA left in the pellet can be collected by adding resin-free TaKaRa buffer and an additional heating
and centrifugation step. Samples are stored at -20°C.

[0376] Genomic DNA from different tumor cell lines (MCF-7, BT-20, BT-474, SKBR-3, AU-565, UACC-812, UACC-893,
HCC-1008, HCC-2157, HCC-1954, HCC-2218, HCC-1937, HCC1599, SW480), or from lymphocytes is prepared
with the QIAamp® DNA Mini Kits or the QIAamp® DNA Blood Mini Kits according to the manufacturer's protocol. Usually
between 1 ng and 1 µg DNA is used per reaction.

Quantitative PCR

[0377] To measure the gene copy number of the genes within the patient samples, the respective primer/probes (see
table below) are prepared by mixing 25 µl of the 100 µM stock solution "Upper Primer" and 25 µl of the 100 µM stock
solution "Lower Primer" with 12.5 µl of the 100 µM stock solution TaqMan Probe (Quencher TAMRA), adjusted to
500 µl with aqua dest. For each reaction, 1.25 µl DNA extract of the patient samples or 1.25 µl DNA from the cell lines
is mixed with 8.75 µl nuclease-free water and added to one well of a 96-Well Optical Reaction Plate (Applied Biosystems
Part No. 4306737). 1.5 µl Primer/Probe mix, 12.5 µl TaqMan Universal PCR Mix (2x) (Applied Biosystems Part
No. 4318157) and 1 µl water are then added. The 96-well plates are closed with 8 Caps/Strips (Applied Biosystems
Part Number 4323032) and centrifuged for 3 minutes. Measurements of the PCR reaction are done according to the
instructions of the manufacturer with a TaqMan 7900 HT from Applied Biosystems (No. 20114) under appropriate
conditions (2 min. 50°C, 10 min. 95°C, 0.15 min. 95°C, 1 min. 60°C; 40 cycles). Software SDS 2.0 from Applied Biosystems
is used according to the respective instructions. CT-values are then further analyzed with appropriate software
(Microsoft Excel™).
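The pipetting scheme in [0377] can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the dictionary keys and function names are hypothetical, and the 12.5 µl volume of the 2x PCR mix is an assumption (the printed value is partly illegible in this text; 12.5 µl is the volume at which a 2x mix makes up half of the reaction).

```python
# Per-well volumes in microliters, as read from [0377].
PER_WELL_UL = {
    "dna_extract": 1.25,
    "nuclease_free_water": 8.75,
    "primer_probe_mix": 1.5,
    "universal_pcr_mix_2x": 12.5,  # assumed value, see lead-in
    "water": 1.0,
}


def diluted_concentration(stock_conc: float, stock_volume: float,
                          final_volume: float) -> float:
    """C1 * V1 = C2 * V2, solved for C2 (same units in as out)."""
    return stock_conc * stock_volume / final_volume


def plate_volumes(n_wells: int, overage: float = 0.0) -> dict:
    """Scale the per-well volumes to n_wells, with optional pipetting overage."""
    factor = n_wells * (1.0 + overage)
    return {name: volume * factor for name, volume in PER_WELL_UL.items()}


total_ul = sum(PER_WELL_UL.values())
# Primer/probe mix: 100 uM stocks diluted to 500 ul.
primer_in_mix_um = diluted_concentration(100.0, 25.0, 500.0)
probe_in_mix_um = diluted_concentration(100.0, 12.5, 500.0)
# Final concentration in the reaction after adding 1.5 ul of the mix.
primer_in_reaction_um = diluted_concentration(primer_in_mix_um, 1.5, total_ul)

print(total_ul, primer_in_mix_um, probe_in_mix_um, primer_in_reaction_um)
print(plate_volumes(96, overage=0.1))
```

Under these assumptions each primer ends up at 5 µM in the primer/probe mix and the probe at 2.5 µM; the per-well volumes add up to 25 µl.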
DNA SEQ ID NO:  Protein SEQ ID NO:  Genbank ID    Unigene_v133_ID  Locus Link ID  Gene Name
6               32                  AB021742.1    322431           4761           NEUROD2
7               33                  NM_006804.1   77628            10948          MLN64
8               34                  NM_003673.1   111110           8557           TELETHONIN
9               35                  NM_002686.1   1892             5409           PNMT
10              36                  X03363.1      323910           2064           ERBB2
11              37                  AB008790.1    86859            2886           GRB7
12              38                  NM_002809.1   9736             5709           PSMD3
13              39                  NM_000759.1   2233             1440           GCSFG
14              40                  AI023317      23106            9862           KIAA0130
15              41                  X55005                         7067           c-erbA-1
16              42                  X72631        211606           9572           NR1D1
17              43                  NM_007359.1   83422            22794          MLN51
18              44                  U77949.1      69563            990            CDC6
19              45                  U41742.1                       5914           RARA
20              46                  NM_001067.1   156346           7153           TOP2A
21              47                  NM_001552.1   1516                            IGFBP4
22              48                  NM_001838.1   1652                            CCR7 EBI1
23              49                  NM_003079.1   332848           6605           SMARCE1 BAF57
24              50                  X14487        99936            3858           KRT10
25              51                  NM_000223.1   66739                           KRT12
26              52                  NM_002279.2   32950            3884           hHKa3-II
53              76                  NM_005937     349196           4302           MLLT6
54              77                  XM_008147     184669           7703           ZNF144
55              78                  NM_138687     432736           8396           PIP5K2B
56              79                  NM_020405     125036           57125          TEM7
57              80                  XM_012694     258579           22806          ZNFN1A3
58              81                  XM_085731     13996            147179         WIRE
59              82                  NM_002795     82793            5691           PSMB3
60              83                  NM_033419     91668            93210          MGC9753 Variant a
Table 1 (continued)

DNA SEQ ID NO:  Protein SEQ ID NO:  Genbank ID    Unigene_v133_ID  Locus Link ID  Gene Name
61              84                                                                MGC9753 Variant c
62              85                                                                MGC9753 Variant d
63              86                                                                MGC9753 Variant e
64              87                                                                MGC9753 Variant g
65              88                                                                MGC9753 Variant h
66              89                                                                MGC9753 Variant i
67              90                  AF395708      374824           94103          ORMDL3
68              91                  NM_032875     194498           84961          MGC15482
69              92                  NM_032192     286192           84152          PPP1R1B
70              93                  NM_032339     333526           84299          MGC14832
71              94                  NM_057555     12101            51242          LOC51242
72              95                  NM_017748     8928             54883          FLJ20291
73              96                  NM_018530     19054            55876          Pro2521
74              97                  NM_016339     118562           51195          Link-GEFII
75              98                  NM_032865     294022           84951          CTEN



Table 2

DNA SEQ ID NO:  Gene description
1               Member of a subfamily of LIM proteins that contains a LIM domain and an SH3 (Src homology region 3) domain
2               Beta 1 subunit of a voltage-dependent calcium channel (dihydropyridine receptor), involved in coupling of excitation and contraction in muscle, also acts as a calcium channel in various other tissues
3               Ribosomal protein L19, component of the large 60S ribosomal subunit
4               Protein with similarity to nuclear receptor-interacting proteins; binds and co-activates the nuclear receptors PPARalpha (PPARA), RARalpha (RARA), RXR, TRbeta1, and VDR
5               we26e0 CDC2-related protein kinase 7
6               Neurogenic differentiation, a basic-helix-loop-helix transcription factor that mediates neuronal differentiation
7               Protein that is overexpressed in malignant tissues, contains a putative trans-membrane region and a StAR Homology Domain (SHD), may function in steroidogenesis and contribute to tumor progression
8               Telethonin, a sarcomeric protein specifically expressed in skeletal and heart muscle, caps titin (TTN) and is important for structural integrity of the sarcomere
Table 2 (continued)

DNA SEQ ID NO:  Gene description
9               Phenylethanolamine N-methyltransferase, acts in catecholamine biosynthesis to convert norepinephrine to epinephrine
10              Tyrosine kinase receptor that has similarity to the EGF receptor, a critical component of IL-6 signaling through the MAP kinase pathway, overexpression associated with prostate, ovary and breast cancer
11              Growth factor receptor-bound protein, an SH2 domain-containing protein that has isoforms which may have a role in cell invasion and metastatic progression of esophageal carcinomas
12              Non-ATPase subunit of the 26S proteasome (prosome, macropain)
13              Granulocyte colony stimulating factor, a glycoprotein that regulates growth, differentiation, and survival of neutrophilic granulocytes
14              Member of the Vitamin D Receptor Interacting Protein co-activator complex, has strong similarity to thyroid hormone receptor-associated protein (murine Trap 100) which function as a transcriptional coregulator
15              Thyroid hormone receptor alpha, a high affinity receptor for thyroid hormone that activates transcription; homologous to avian erythroblastic leukemia virus oncogene
16              encoding Rev-ErbAalp nuclear receptor subfamily 1, group D, member 1
17              Protein that is overexpressed in breast carcinomas
18              Protein which interacts with the DNA replication proteins PCNA and Orc1, translocates from the nucleus following onset of S phase; S. cerevisiae homolog Cdc6p is required for initiation of S phase
19              Retinoic acid receptor alpha, binds retinoic acid and stimulates transcription in a ligand-dependent manner
20              DNA topoisomerase II alpha, member of a family of proteins that relieves torsional stress created by DNA replication, transcription, and cell division;
21              Insulin-like growth factor binding protein, the major IGFBP of osteoblast-like cells, binds IGF1 and IGF2 and inhibits their effects on promoting DNA and glycogen synthesis in osteoblastic cells
22              HUMEBI103 G protein-coupled receptor (EBI 1) gene exon 1 chemokine (C-C motif) receptor 7 G protein-coupled receptor
23              Protein with an HMG 1/2 DNA-binding domain that is subunit of the SNF/SWI complex associated with the nuclear matrix and implicated in regulation of transcription by affecting chromatin structure
24              Keratin 10, a type I keratin that is a component of intermediate filaments and is expressed in terminally differentiated epidermal cells; mutation of the corresponding gene causes epidermolytic hyperkeratosis
25              Keratin 12, a component of intermediate filaments in corneal epithelial cells; mutation of the corresponding gene causes Meesmann corneal dystrophy
26              Hair keratin 3B, a type I keratin that is a member of a family of structural proteins that form intermediate filaments
53              MLLT6 Myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to, 6
54              zinc finger protein 144 (Mel-18)
55              phosphatidylinositol-4-phosphate 5-kinase type II beta isoform a
56              tumor endothelial marker 7 precursor
Table 2 (continued)

DNA SEQ ID NO:  Gene description
57              zinc finger protein, subfamily 1A, 3
58              WASP-binding protein putative cr16 and wip like protein similar to Wiskott-Aldrich syndrome protein
59              proteasome (prosome, macropain) subunit, beta type, 3
60              Predicted
67              ORM1-like 3 (S. cerevisiae)
68              F-box domain A Receptor for Ubiquitination Targets
69              protein phosphatase 1, regulatory (inhibitor) subunit 1B (dopamine and cAMP regulated phosphoprotein, DARPP-32)
70              Predicted Protein
71              Predicted Protein
72              Predicted Protein
73              Predicted Protein
74              Link-GEFII: Link guanine nucleotide exchange factor II
75              C-terminal tensin-like
Table 4

DNA SEQ ID NO:  Protein SEQ ID NO:  Gene Name  dbSNP ID    Type             Codon            AA-Seq
9               34                  ERBB2      rs2230698   coding-synon     TCA|TCG          S|S
9               34                  ERBB2      rs2230700   noncoding
9               34                  ERBB2      rs1058808   coding-nonsynon  CCC|GCC          P|A
9               34                  ERBB2      rs1801200   noncoding
9               34                  ERBB2      rs903506    noncoding
9               34                  ERBB2      rs2313170   noncoding
9               34                  ERBB2      rs1136201   coding-nonsynon  ATC|GTC          I|V
9               34                  ERBB2      rs2934968   noncoding
9               34                  ERBB2      rs2172826   noncoding
9               34                  ERBB2      rs1810132   coding-nonsynon  ATC|GTC          I|V
9               34                  ERBB2      rs1801201   noncoding
14              39                  c-erbA-1   rs2230702   coding-synon     TCC|TCT          S|S
14              39                  c-erbA-1   rs2230701   coding-synon     GCC|GCT          A|A
14              39                  c-erbA-1   rs1126503   coding-nonsynon  ACC|AGC          T|S
14              39                  c-erbA-1   rs3471      noncoding
19              44                  TOP2A      rs13695     noncoding
19              44                  TOP2A      rs471692    noncoding
19              44                  TOP2A      rs558068    noncoding
19              44                  TOP2A      rs1064288   noncoding
19              44                  TOP2A      rs1061692   coding-synon     GGA|GGG          G|G
19              44                  TOP2A      rs520630    noncoding
19              44                  TOP2A      rs782774    coding-nonsynon  AAT|ATT|ATT|TTT  N|I|I|F
19              44                  TOP2A      rs565121    noncoding
19              44                  TOP2A      rs2586112   noncoding
19              44                  TOP2A      rs532299    coding-nonsynon  TTT|GTT          F|V
19              44                  TOP2A      rs2732786   noncoding
19              44                  TOP2A      rs1804539   noncoding
19              44                  TOP2A      rs1804538   noncoding
19              44                  TOP2A      rs1804537   noncoding
19              44                  TOP2A      rs1141364   coding-synon     AAA|AAG          K|K
23              48                  KRT10      rs12231     noncoding
23              48                  KRT10      rs1132259   coding-nonsynon  CAT|CGT          H|R
23              48                  KRT10      rs1132257   coding-synon     CTG|TTG          L|L
23              48                  KRT10      rs1132256   coding-synon     GCC|GCT          A|A
23              48                  KRT10      rs1132255   coding-synon     CTG|TTG          L|L
23              48                  KRT10      rs1132254   coding-synon     GGC|GGT          G|G
23              48                  KRT10      rs1132252   coding-synon     TTC|TTT          F|F
Table 4 (continued)

DNA SEQ ID NO:  Protein SEQ ID NO:  Gene Name  dbSNP ID    Type             Codon            AA-Seq
23              48                  KRT10      rs1132268   coding-nonsynon  CAG|GAG          Q|E
23              48                  KRT10      rs1132258   coding-nonsynon  CGG|TGG          R|W
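The Codon, AA-Seq and Type columns of Table 4 are related through the standard genetic code, which the following Python sketch makes explicit. It is an illustrative check, not part of the described method: the function name is hypothetical, and only the standard codon table (DNA alphabet, "*" for stop) is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codons: str):
    """Translate a 'CODON|CODON' entry and classify the substitution.

    Returns (amino acid string such as 'S|S', 'coding-synon' or 'coding-nonsynon').
    """
    amino_acids = [CODON_TABLE[codon] for codon in codons.upper().split("|")]
    kind = "coding-synon" if len(set(amino_acids)) == 1 else "coding-nonsynon"
    return "|".join(amino_acids), kind


# Codon entries as listed in Table 4.
for entry in ("TCA|TCG", "CCC|GCC", "AAT|ATT|ATT|TTT", "CGG|TGG"):
    print(entry, classify_snp(entry))
```

Running the check over the Codon column is a quick way to confirm that each AA-Seq and Type cell agrees with its codons.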
PRIMER                    SEQUENCE
PPP1R1B FOR               GGGATTGTTTCGCCACACATA
PPP1R1B REV               CCGATGTTAAGGCCCATAGC
MGC14832          FAM 5'  TAAAATGTCCGGCCAACATGAGTTCCC
MGC14832 FOR              CGCAGTGCCTGGCACAT
MGC14832 REV              GACACCCCCTGACCTATGGA
LOC51242          FAM 5'  CAGTGACCTCTCCCGTTCCCTTGGA
LOC51242 FOR              TGGGTCCCTGTGTCCTCTTC
LOC51242 REV              AGGGTCAGGAGGGAGAAAAC
FLJ20291          FAM 5'  CCAGTGCCCACCCGTTAAAGAGTCAA
FLJ20291 FOR              TTGTGGGACACTCAGTAACTTTGG
FLJ20291 REV              ACAAGCACTCCCACCGAGAT
PRO2521           FAM 5'  AGTCTGTCCTCACTGCCATCGCCA
PRO2521 FOR               AAGCCTCTGGGTTTTCCCTTT
PRO2521 REV               CCCACTGGTGACAGGATGGT
Link-GEFII        FAM 5'  CATCTGACATCTTTCCCGTGGAG
Link-GEFII FOR            CTTTGCACGATGTCTCAACCA
Link-GEFII REV            TTTCCCGTGGAGCAGGAA
CTEN              FAM 5'  CCGCCGCCTAATATGCAACATTAGGG
CTEN FOR                  CGAGTATTCCAAAGCTGGTATCG
CTEN REV                  ATCACAGAGAGATGGCCCTTATCT
SEQUENCE LISTING

<110> Bayer AG 

<120> METHODS AND COMPOSITIONS FOR THE PREDICT: 
PREVENTION AND TREATMENT OF MALIGNANT NEOPLASIA 

<130> LgA 36108.1 EP 

<150> EP02010291.9 
<151> 2002-05-21 

<160> 314 

<170> PatentIn version 3.1



<210> 1 

<211> 3846 

<212> DNA 

<213> Homo sapiens 



<400> 1 

gcctcccgcc agctcgcctc ggggaacagg acgcgcgtga gctcaggcgt ccocgcccca 60 

gcttttctcg gaaccatgaa cccoaactgc gcccggtgcg gcaagatcgt gtatoccacg 120 

gagaaggtga actgtctgga taagttctgg cataaagcat gcttcoattg cgagacctgc 180 

aagatgacac tgaacatgaa gaactacaag ggctacgaga agaagcccta ctgcaacgca 240 

cactacccoa agcagtcctt caccatggtg gcggacaccc cggaaaacct tcgcctcaag 300 

caacagagtg agatcoagag tcaggtgcgc tacaaggagg agtttgagaa gaacaagggc 360 

aaaggtttca gcgtagtggo agacacgccc gagotccaga gaatcaagaa gaccoaggac 420 

cagatcagta atataaaata ccatgaggag tttgagaaga gccgcatggg ccctagcggg 480 

ggcgagggca tggagccaga gcgtcgggat tcacaggacg gcagcagcta ccggcggccc 540 

ctggagcagc agcagcctca ccacatcccg accagtgccc cggtttacca gcagccccag 600 

cagcagccgg tggcccagtc ctatggtggc tacaaggagc ctgcagcccc agtctccata 660 

cagcgcagcg ccccaggtgg tggcgggaag cggtaccgcg cggtgtatga ctacagcgcc 720 

gcogacgagg acgaggtctc cttccaggac ggggacacca tcgtcaacgt gcagcagatc 780 

gacgacggct ggatgtacgg gacggtggag cgcaccggcg acacggggat gctgccggcc 840 

aactacgtgg aggccatctg aacccggagc gcccccatct gtcttcagca cattccacgg 900 

catcgcatcc gtcctgggcg tgagccgtcc attcttcagt gtotctgttt tttaaaacct 960 
gcgacagctt gtgattccta cccctcttcc agcttctttt gccaactgaa gccttcttct 1020

gccacttctg cgggctccct cctctggcag gcttcccccg tgatcgactt cttggttttc 1080 

tctctggatg gaacgggtat gggcctctct gggggaggca gggctggaat gggagacctg 1140 

ttggcctgtg ggcctcaoot gcccctctgt tctctcccct cacatcctcc tgcccagctc 1200

otcacatacc cacacattco agggctgggg tgagcctgac tgccaggacc ccaggtcagg 1260 

ggctccctac attccccaga gtgggatcca cttcttggtt cctgggatgg cgatggggac 1320 

tctgccgctg tgtagggacc agtgggatgg gctctacctc tctttctcaa agagggggct 1380 

ctgcccacct ggggtctctc tccctacctc cctcctcagg ggcaacaaca ggagaatggg 1440 

gttcctgctg tggggcgaat tcatcccctc cccgcgcgtt ccttcgcaca ctgtgatttt 1500

gccctcctgc ccacgcagac ctgcagcggg caaagagctc ccgaggaagc acagcttggg 1560 

tcaggttctt gcctttctta attttaggga cagctaccgg aaggagggga acaaggagtt 1620 

ctcttccgca gcccctttcc ccacgcccac ccccagtctc cagggaccct tgcctgcctc 1680 

ctaggctgga agccatggtc ccgaagtgta gggcaagggt gcctcaggac cttttggtot 1740 

tcagcctccc tcagccccca ggatctgggt taggtggccg ctcctccctg ctcctcatgg 1800 

gaagatgtct cagagccttc catgacctco cctccccagc ccaatgooaa gtggacttgg 1860

agotgcacaa agtcagoagg gaccactaaa tctccaagao ctggtgtgcg gaggcaggag 1920

catgtatgtc tgcaggtgtc tgacacgcaa gtgtgtgagt gtgagtgtga gagatggggo 1980

gggggtgtgt ctgtaggtgt ctctgggcct gtgtgtgggt ggggttatgt gagggtatga 2040

agagctgtct tcccctgaga gtttcctoag aacccacagt gagaggggag ggctcctggg 2100

gcagagaagt tccttaggtt ttctttggaa tgaaattcct ccttcccccc atctctgagt 2160

ggaggaagcc caccaatctg ccctttgcag tgtgtcaggg tggaaggtaa gaggttggtg 2220

tggagttggg gctgccatag ggtctgcagc ctgctggggc taagcggtgg aggaaggcto 2280

tgtcactcca ggcatatgtt tccccatctc tgtotggggc tacagaatag ggtggcagaa 2340

gtgtcaccct gtgggtgtct coctcggggg ctcttcccct agacctcccc ctcacttaca 2400

taaagctccc ttgaagcaag aaagagggtc ccagggctgc aaaactggaa gcacagcctc 2460

ggggatgggg agggaaagac ggtgctatat ccagttcctg ctctctgctc atgggtggct 2520

gtgacaaccc tggcctcact tgattcatct ctggttttct tgccaccctc tgggagtccc 2580

catoccattt tcatcctgag cccaaccagg ccctgccatt ggcctcttgt cccttggcac 2640

acttgtaccc acaggtgagg ggcaggacct gaaggtattg gcctgttcaa caatcagtca 2700

tcatgggtgt ttttgtcaao tgcttgttaa ttgatttggg gatgtttgcc ccgaatgaga 2760

ggttgaggaa aagactgtgg gtggggaggc cctgcctgac ccatcccttt tcctttctgg 2820

ccccagccta ggtggaggca agtggaatat cttatattgg gcgatttggg ggctcgggga 2880

ggcagagaat ctcttgggag tottgggtgg cgctggtgca ttctgtttcc tcttgatctc 2940

aaagcacaat gtggatttgg ggaccaaagg tcagggacac atccccttag aggacctgag 3000

tttgggagag tggtgagtgg aagggaggag cagcaagaag cagrcctgttt toaotcagct 3060

taattotcct tcccagataa ggcaagccag tcatggaatc ttgctgcagg ccctccctct 3120

actcttcctg tcctaaaaat aggggccgtt ttcttacaca cccccagaga gaggagggac 3180

tgtcacaotg gtgctgagtg accgggggot gctgggcgtc tgttctttac caaaaccatc 3240

catccctaga agagcacaga gccctgaggg gctgggctgg gctgggctga goccctggtc 3300

ttctctaoag ttcacagagg totttcagct catttaatcc oaggaaagag gcatcaaagc 3360

tagaatgtga atataacttt tgtgggcoaa tactaagaat aacaagaagc ccagt^rgtga 3420

ggaaagtgcg ttctoccagc actgcctcct gttttctccc tctcatgtcc ctccagggaa 3480

aatgacttta ttgcttaatt tctgoctttc ccccctoaca catgcacttt tgggcctttt 3540

tttatagctg gaaaaaacaa aataccacco tacaaacctg tatttaaaaa gaaacagaaa 3600

tgaccacgtg aaatttgoct ctgtccaaac atttcatccg tgtgtatgtg tatgtgtgtg 3660

agtgtgtgaa gccgccagtt catottttta tatggggttg ttgtctcatt ttggtctgtt 3720

ttggtcccct ccctcgtggg cttgtgctcg ggatcaaacc tttctggcct gttatgattc 3780

tgaacatttg acttgaacca caagtgaatc tttctcctgg tgactcaaat aaaagtataa 3840

ttttta 3846



<210> 2

<211> 1711

<212> DNA

<213> Homo sapiens



<400> 2

gagggaaggc aggaaggagg cagccgaagg ccgagctggg tggotggacc gggtgotggc 60 

tgcgcgcgct gctttcggct cccacggcct ctcccatgcg ctgagggagc ccggctgcgg 120 

gccggcggcg ggaggggagg ctcctctcca tggtccagaa gaccagcatg tcccggggcc 180 



ottacocacc ctcccaggag atccccatgg aggtcttcga ccccagcccg cagggcaaat 240

acagcaagag gaaagggcga ttcaaacggt cagatgggag cacgtcctcg gataccacat 300

ccaacagctt tgtccgccag ggctcagcgg agtcctacac cagccgtcca tcagactctg 360

atgrtatctct ggaggaggac cgggaagcct taaggaagga agcagagcgc caggcattag 420 

ogcagctcga gaaggccaag accaagccag tggcatttgc tgtgcggaca aatgttggct 480 

aoaatccgtc tccaggggat gaggtgcctg tgcagggagt ggccatcacc ttcgagccca 540 

aagacttcct gcacatcaag gagaaataca ataatgactg gtggatcggg cggctggtga 600 

aggagggctg tgaggttggc ttcattccca gccccgtcaa actggacagc cttcgcctgc 660

tgoaggaaca gaagctgcgc cagaaccgcc tcggctccag caaatcaggc gataactcca 720 

gttccagtct gggagatgtg gtgactggca occgccgcco cacacccoct gccagtgcca 780 

aacagaagca gaagtcgaca gagcatgtgc ccacctatga cgtggtgcct tccatgaggc 840 

ccatcatcct ggtgggaccg tcgctcaagg gctacgaggt tacagacatg atgoagaaag 900 

ctttatttga cttcttgaag oatcggtttg atggcaggat ctcoatcact cgtgtgacgg 960

cagatatttc cctggctaag cgotcagtto tcaacaaccc cagcaaacac atcatcattg 1020 

agcgctccaa caoacgctcc agcctggctg aggtgcagag tgaaatcgag cgaatcttcg 1080 

agctggcccg gaoccttcag ttggtcgotc tggatgotga caeca tcaat cacccagccc 1140 

agctgtccaa gacctcgotg gcccccatca ttgtttacat caagatcacc tctcccaagg 1200 

tacttcaaag gctcatcaag tcccgaggaa agtctcagtc caaaoacctc aatgtccaaa 1260 

tageggcote ggaaaagctg gcacagtgcc cccctgaaat gtttgacatc atcctggatg 1320 

agaaccaatt ggaggatgee tgegagcate tggcggagta ettggaagee tattggaagg 1380 

ccacacaccc gcccagcagc acgccaccca atccgctgct gaaccgcacc atggctaccg 1440 

cagccctgog ccgtagccct gcccctgtct ccaacctcca ggtacaggtg ctcacctcgc 1500 

taaggagaaa cctcggcttc tggggeggge tggagtcctc acagegggge agtgtggtgc 1560 

cacaggagca ggaacatgco atgtagtggg cgccctgcco gtcttccotc ctgctotggg 1620

gtcggaactg gagtgcaggg aacatggagg aggaagggaa gagctttatt ttgtaaaaaa 1680 

ataagatgag eggcaaaaaa aaaaaaaaaa a 1711 



<210> 3 

<211> 698 

<212> DNA 

<213> Homo sapiens 



<400> 3 

ttttcctttc getgetgegg ccgcagccat gagtatgetc aggcttcaga agaggctege 60 

ctctagtgtc ctccgctgtg gcaagaagaa ggtctggtta gaccccaatg agaccaatga 120 

aatcgccaat gccaactccc gtcagcagat ccggaagctc atcaaagatg ggctgatcat 180 

ccgcaagcct gtgaeggtec attcccgggc tegatgeegg aaaaacacct tggcccgccg 240 

gaagggcagg cacatgggca taggtaagcg gaagggtaca gccaatgccc gaatgecaga 300 

gaaggtcaca tggatgagga gaatgaggat tttgegcegg ctgctcagaa gataccgtga 360 
atctaagaag atcgatcgcc acatgtatca cagcctgtac ctgaaggtga aggggaatgt 420

gttcaaaaac aageggatte tcatggaaca catccacaag ctgaaggcag acaaggcccg 480



caagaagctc ctggctgacc aggctgaggc ccgcaggtct aagaccaagg aagcacgcaa 540 

gcgccgtgaa gagcgcctcc aggecaagaa ggaggagatc atcaagactt tatccaagga 600 

ggaagagacc aagaaataaa acctcccact ttgtctgtac atactggect ctgtgattac 660 

atagatcagc cattaaaata aaacaagect taatctgc 698 



<210> 4 

<211> 5810 

<212> DNA 

<213> Homo sapiens 



<400> 4

gggaagatgg cggcggcctc gagcaccctc ctcttcttgc cgccggggac ttcagattga 60

tccttcccgg gaagagtagg gactgctggt gccctgcgtc ccgggatccc gagccaactt 120

gtttcctccg ttagtggtgg ggaagggctt atccttttgt ggcggatcta gcttctcctc 180

gccttcagga tgaaagctca ggggggaaac cgaggagtca gaaaagctga gtaagatgag 240

ttctctcctg gaacggctcc atgcaaaatt taaccaaaat agaccctgga gtgaaaccat 300

taagcttgtg cgtcaagtca tggagaagag ggttgtgatg agttctggag ggcatcaaca 360

tttggtcagc tgtttggaga cattgcagaa ggctctcaaa gtaacatctt taccagcaat 420

gact^atcgt ttggagtcca tagcaggaca gaatggactg ggctctcatc tcagtgccag 480

tggcactgaa tgttacatca cgtcagatat gttctatgtg gaagtgcagt tagatcctgc 540

aggacagctt tgtgatgtaa aagtggctca ccatggggag aatcctgtga gctgtccgga 600

gcttgtacag cagctaaggg aaaaaaattc tgatgaattt tctaagcacc ttaagggcct 660

tgttaatctg tataaccttc caggggacaa caaactgaag actaaaatgt acttggctct 720

ccaatcctta gaacaagata tttctaaaat ggcaattatg tactggaaag caactaatgc 780

tggtcccttg gataagattc ttcatggaag tgttggctat ctcacaccaa ggagtggggg 840

tcatttaatg aacctgaagt actatgtctc tccttctgac ctactggatg acaagactgc 900

atctcccatc attttgcatg agaataatgt ttctcgatct ttgggcatga atgcatcagt 960

gacaattgaa ggaacatctg ctgtgtacaa actcccaatt gcaccattaa ttatggggtc 1020

acatccagtt gacaataaat ggaccccttc cttctcctca atcaccagtg ccaacagtgt 1080

tgatcttcct gcctgtttct tcttgaaatt tccccagcca atcccagtat ctagagcatt 1140

tgttcagaaa ctgcagaact gcacaggaat tccattgttt gaaactcaac caacttatgc 1200

acccctgtat gaactgatca ctcagtttga gctatcaaag gaccctgacc ccataccttt 1260

gaatcacaac atgagatttt atgctgctct tcctggtcag cagcactgct atttcctcaa 1320

caaggatgct cctcttccag atggccgaag tctacaggga acccttgtta gcaaaatcac 1380

ctttcagcac cctggccgag ttcctcttat cctaaatctg atcagacacc aagtggccta 1440

taacaccctc attggaagct gtgtcaaaag aactattctg aaagaagatt ctcctgggct 1500

tctccaattt gaagtgtgtc ctctctcaga gtctcgtttc agcgtatctt ttcagcaccc 1560

tgtgaatgac tccctggtgt gtgtggtaat ggatgtgcag ggcttaacac atgtgagctg 1620

taaactctac aaagggctgt cggatgcact gatctgcaca gatgacttca ttgccaaagt 1680

tgttcaaaga tgtatgtcca tccctgtgac gatgagggct attcggagga aagctgaaac 1740

cattcaagcc gacaccccag cactgtccct cattgcagag acagttgaag acatggtgaa 1800

aaagaacctg cccccggcta gcagcccagg gtatggcatg accacaggca acaacccaat 1860

[illegible] accattacca ccttatttaa 1920

tatgagcatg agcatcaaag atcggcatga gtcggtgggc catggggagg acttcagcaa 1980

ggtgtctcag aacccaattc ttaccagttt gttgcaaatc acagggaacg gggggtctac 2040

cattggctcg agtccgaccc ctcctcatca cacgccgcca cctgtctctt cgatggccgg 2100

caacaccaag aaccacccga tgctcatgaa ccttctcaaa gataatcctg cccaggattt 2160 

45 ctcaaccctt tatggaagca gcoctttaga aaggcagaac tcctottccg gctcaccccg 2220 

catggaaata tgctcgggga gcaacaagac caagaaaaag aagtcatcaa gattaccaca 2280 

tgagaaacca aagcaccaga ctgaagatga ctttcagagg gagotatttt caatggatgt 2340 

tgactcacag aaccctatct ttgatgtcaa catgacagct gacacgctgg atacgccaca 2400 

catcactcca gctccaagcc agtgtagcac tcccccaaca acttacccac aaccagtacc 2460 

so tcacccccaa cccagtattc aaaggatggt ccgactatcc agttcagaca gcattggccc 2520 

agatgtaact gacatccttt cagacattgc agaagaagct tctaaacttc ccagcactag 2580 

tgatgattgc ccagccattg gcacccctct tcgagattct tcaagctctg ggcattctca 2640 

gagtaccctg tttgactotg atgtctttca aactaacaat aatgaaaatc catacactga 2700 

tccagctgat cttattgcag atgctgctgg aagccccagt agtgactctc ctaccaatca 2760 

tttttttcat gatggagtag atttcaatcc tgatttattg aacagccaga gccaaagtgg 2820 

ttttggagaa gaatattttg atgaaagcag ccaaagtggg gataatgatg atttcaaagg 2880

atttgcatct caggcactaa atactttggg ggtgccaatg cttggaggtg ataatgggga 2940

gaccaagttt aagggcaata accaagccga cacagttgat ttcagtatta tttcagtagc 3000 

cggcaaagct ttagctcctg cagatcttat ggagcatcac agtggtagtc agggtccttt 3060

actgaccact ggggaottag ggaaagaaaa gactcaaaag agggtaaagg aaggcaatgg 3120 

caccagtaat agtactctct cggggcccgg attagacagc aaaccaggga agcgcagtcg 3180 

gaccccttct aatgatggga aaagcaaaga taagcctcca aagcggaaga aggcagacac 3240 

tgagggaaag tctccatctc atagttcttc taacagacct tttaccccac ctaccagtac 3300 

aggtggatct aaatcgccag gcagtgcagg aagatctcag actcccccag gtgttgccaa 3360 

accacccatt cccaaaatca ctattcagat tcctaaggga acagtgatgg tgggcaagcc 3420 

10 ttcctctcac agtcagtata coagcagtgg ttctgtgtct tcctcaggca gcaaaagcca 3480 

ccatagccat tcttcctcct cttcctcatc tgcttccacc tcagggaaga tgaaaagcag 3540 

taaatcagaa ggttcatcaa gttccaagtt aagtagcagt atgtattcta gcoaggggtc 3600 

ttatggatct agccagtcca aaaattcatc coagtctggg gggaagccag gctcctctcc 3660 

cataaccaag catggaotga gcagtggctc tagcagoacc aagatgaaac otcaaggaaa 3720 

gocatcatca cttatgaatc cttctttaag taaaooaaac atatcccott ctcattcaag 3780 

gccacctgga ggctctgaca agcttgcctc tccaatgaag cctgttcotg gaactcctcc 3840 

atcctctaaa gcoaagtccp ctatcagttc aggttctggt ggttctcata tgtotggaac 3900 

tagttcaagc tctggoatga agtcatcttc agggttagga tcctcaggot cgttgtccca 3960 

gaaaactccc ccatcatcta attcctgtao ggcatcttcc tcctcotttt cctcaagtgg 4020 

ctcttccatg tcatcctctc agaaccagca tgggagttct aaaggaaaat otcccagcag 4080 

aaacaagaag ccgtccttga cagctgtcat agataaactg aagcatgggg ttgtcacoag 4140 

tggccctggg ggtgaagacc cactggacgg ccagatgggg gtgagcacaa attcttccag 4200 

acatcctatg tcctcaaaao ataacatgtc aggaggagag tttcagggca agcgtgagaa 4260 

aagtgataaa gacaaatcaa aggtttccac ctccgggagt tcagtggatt cttctaagaa 4320 

gacctcagag tcaaaaaatg tggggagcac aggtgtggca aaaattatca tcagtaagoa 4380 

tgatggaggc tcccctagca ttaaagccaa agtgactttg cagaaacctg gggaaagtag 4440 

tggagaaggg cttaggcctc aaatggcttc ttctaaaaac tatggctctc cactcatcag 4500 

tggttccact ccaaagcatg agcgtggctc tcccagccat agtaagtcac cagcatatac 4560 

cccccagaat ctggacagtg aaagtgagtc aggctcctca atagcagaga aatottatca 4620 

gaatagtccc agctcagacg atggtatccg accacttcca gaatacagca cagagaaaca 4680 

taagaagcac aaaaaggaaa agaagaaagt aaaagacaaa gatagggaco gagaccggga 4740 

30 caaagaccga gacaagaaaa aatctcatag catcaagcca gagagttggt coaaatcacc 4800 

catctcttca gaccagtcct tgtctatgaa aagtaacaca atcttatctg cagacagacc 4860 

ctcaaggctc agcccagact ttatgattgg ggaggaagat gatgatctta tggatgtggc 4920

cctgattggg aattaggaac cttatttcct aaaagaaaca gggccagagg aaaaaaaact 4980

attgataagt ttataggcaa accaccataa ggggtgagrtc agacaggtct gatttggtta 5040

agaatcctaa atggcatggc tttgacatca agctgggtga attagaaagg catatccaga 5100

ccctattaaa gaaaccacag ggtttgatta tggttaccag gaagtcttct ttgttcctgt 5160

gccagaaaga aagttaaaat acttgcttaa gaaagggagg ggggtgggag gggtgtaggg 5220

agagggaagg gagggaaaca gttttgtggg aaatattcat atatattttc ttctcccttt 5280 

ttccattttt aggccatgtt ttaaaotcat tttagtgcat gtatatgaag ggctgggcag 5340 

aaaatgaaaa agcaatacat tccttgatgc atttgcatga aggttgttca actttgtttg 5400 

40 aggtagttgt ccgtttgagt catgggcaaa tgaaggactt tggtcatttt ggacacttaa 5460 

gtaatgtttg gtgtctgttt cttaggagtg actgggggag ggaagattat tttagctatt 5520 

tatttgtaat attttaacco tttatctgtt tgtttttata cagtgtttcg ttctaaatct 5580 

atgaggttta gggttcaaaa tgatggaagg ccgaagagca aggcttatat ggtggtaggg 5640 

agcttatagc ttgtgctaat actgtagcat caagcccaag caaattagtc agagcccgcc 5700 

45 tttagagtta aatataatag aaaaaccaaa atgatatttt tattttagga gggtttaaat 5760 

agggttcaga gatcatagga atattaggag ttacctctct gtggaggtat 5810 

<210> 5 

<211> 5515

<212> DNA 

<213> Homo sapiens 




EP 1 365 034 A2 



<400> 5 

cttttttccc ttcttcaggt caggggaaag ggaatgccca attcagagag acatgggggc 60 

aagaaggacg ggagtggagg agcttctgga actttgcagc cgtcatcggg aggcggcagc 120 

tctaacagca gagagcgtca ccgcttggta tcgaagcaca agcggcataa gtccaaacac 180 

tccaaagaca tggggttggt gacccocgaa gcagcatccc tgggcacagt tatcaaacct 240 

ttggtggagt atgatgatat cagctctgat tccgacacct tctccgatga catggccttc 300 

aaactagaco gaagggagaa cgacgaacgt cgtggatcag atcggagoga ccgcctgcac 360 

10 aaacatcgtc accaccagca caggcgttoc cgggacttac taaaagctaa acagaccgaa 420 

aaagaaaaaa gccaagaagt ctccagcaag tcgggatcga tgaaggaccg gatatcggga 480 

agttcaaagc gttcgaatga ggagactgat gactatggga aggogcaggt agccaaaagc 540 

agcagcaagg aatccaggtc atccaagctc cacaaggaga agaccaggaa agaacgggag 600 

otgaagtctg ggoacaaaga coggagtaaa agtcatcgaa aaagggaaaa acacaaaagt 660 

15 tacaaaacag tggacagcoc aaaacggaga tccaggagco cccacaggaa gtggtctgac 720 

agctocaaac aagatgatag cccctcggga gcttottatg gccaagatta tgacottagt 780 

ccctcacgat otcatacctc gagcaattat gaotootaoa agaaaagtco tggaagtaco 840 

tcgagaaggc agtcggtcag tcccccttac aaggagcctt cggcctacca gtooagcacc 900 

cggtcacoga goccctacag taggcgacag agatotgtca gtccctatag caggagacgg 960 

tcgtccagct acgaaagaag tggatcttao agcgggogat ogcccagtoc otatggtcga 1020 

20 aggcggtcca gcagcccttt cctgagcaag cggtctctga gtcggagtcc actccccagt 1080 

aggaaatcoa tgaagtccag aagtagaagt cctgcatatt caagacattc atcttctcat 1140 

agtaaaaaga agagatcoag ttcacgcagt cgtcattcca gtatctcacc tgtcaggctt 1200 

ccacttaatt ccagtctggg agatgaactc agtaggaaaa agaaggaaag agcagctgot 1260 

gctgctgcag caaagatgga tggaaaggag tccaagggtt cacctgtatt tttgcctaga 1320 

25 aaagagaaca gttcagtaga ggctaaggat tcaggtttgg agtctaaaaa gttacccaga 1380 

agtgtaaaat tggaaaaatc tgccccagat actgaactgg tgaatgtaac acatctaaac 1440 

acagaggtaa aaaattcttc agatacaggg aaagtaaagt tggatgagaa ctccgagaag 1500 

catcttgtta aagatttgaa agcacaggga acaagagact ctaaacccat agcactgaaa 1560 

gaggagattg ttactccaaa ggagacagaa aoatcagaaa aggagacccc tccaootctt 1620 

cccacaattg cttctccccc acoccctcta ccaactacta cccctccacc tcagacaccc . 1680 

30 cctttgccac ctttgoctcc aataccagct cttccacagc aaccacctct gcctoottct 1740 

oagccagcat ttagtcaggt tcctgcttcc agtacttoaa ctttgccccc ttctactcac 1800 

tcaaagacat ctgctgtgtc ctctcaggca aattotcagc cccctgtaca ggtttctgtg 1860 

aagactcaag tatotgtaac agctgatatt ocacacctga aaacttcaac gttgoctcct 1920 

ttgcccctcc cacccttatt acctggaggt gatgacatgg atagtocaaa agaaactott 1980 

35 ccttcaaaac ctgtgaagaa agagaaggaa cagaggacac gtcacttact cacagacctt 2040 

cctctccctc cagagctccc tggtggagat ctgtctoccc oagactctoc agaaccaaag 2100 

goaatcacac cacctcagca accatataaa aagagaccaa aaatttgttg tcctcgttat 2160 

ggagaaagaa gacaaaoaga aagcgactgg gggaaacgct gtgtggacaa gtttgacatt 2220 

attgggatta ttggagaagg aacctatggc caagtatata aagccaggga caaagacaca 2280 

ggagaactag tggctctgaa gaaggtgaga ctagacaatg agaaagaggg cttcccaatc 2340 

40 acagccattc gtgaaatcaa aatcottcgt cagttaatcc accgaagtgt tgttaacatg 2400 

aaggaaattg tcacagataa aoaagatgca ctggatttca agaaggacaa aggtgcottt 2460 

taccttgtat ttgagtatat ggaccatgac ttaatgggac tgctagaatc tggtttggtg 2520 

cacttttctg aggaccatat caagtcgtto atgaaacagc taatggaagg attggaatac 2580 

tgtcacaaaa agaatttcct gcatcgggat attaagtgtt ctaacatttt gctgaataaa 2640 

45 agtgggcaaa tcaaactagc agattttgga cttgctcggc tctataactc tgaagagagt 2700 

cgcccttaca caaacaaagt cattactttg tggtaccgac ctccagaact actgctagga 2760 

gaggaacgtt acacaccagc catagatgtt tggagctgtg gatgtattct tggggaacta 2820 

ttcacaaaga agcctatttt tcaagccaat ctggaactgg ctcagctaga actgatcagc 2880 

cgactttgtg gtagcccttg tccagctgtg tggcctgatg ttatcaaact gccctacttc 2940 

aacaccatga aaccgaagaa gcaatatcga aggcgtctac gagaagaatt ctctttcatt 3000 

5 ccttctgcag cacttgattt attggaccac atgctgacac tagatcctag taagcggtgc 3060 

aoagctgaac agaccctaca gagcgacttc cttaaagatg tcgaactcag caaaatggct 3120 

cctccagacc tcccccactg gcaggattgc catgagttgt ggagtaagaa acggcgacgt 3180 

cagcgacaaa gtggtgttgt agtcgaagag ccacctccat ccaaaacttc tcgaaaagaa 3240 

actacctcag ggacaagtac tgagoctgtg aagaacagca gcccagcacc acctcagcct 3300 

gctcctggca aggtggagtc tggggctggg gatgcaatag gccttgctga catcacacaa 3360

cagctgaatc aaagtgaatt ggcagtgtta ttaaacctgc tgcagagcca aaccgacctg 3420

agcatccctc aaatggcaca gctgcttaac atccactcca acccagagat goagoagcag 3480 

ctggaagccc tgaaccaatc catcagtgcc ctgacggaag ctacttccca gcagcaggac 3540

tcagagacca tggccccaga ggagtctttg aaggaagcac cctctgcccc agtgatcctg 3600 

3 cottcagcag aacagatgac ccttgaagot tcaagcacaa oagctgacat gcagaatata 3660 

ttggcagttc tcttgagtca gctgatgaaa acccaagagc cagcaggcag tctggaggaa 3720 

aacaacagtg acaagaacag tgggccacag gggccccgaa gaactcccac aatgccacag 3780 

gaggaggcag cagcatgtcc tcctcacatt cttccaccag agaagaggcc cootgagccc 3840 

cccggacctc caccgccgcc acctccaccc cctctggttg aaggcgatct ttccagcgcc 3900 

10 coccaggagt tgaacccagc ogtgacagcc gccttgctgc aaattttato ccagcctgaa 3960 

gcagagcctc ctggccacct gccacatgag caccaggcct tgagaccaat ggagtactcc 4020 

acccgacccc gtccaaacag gacttatgga aacactgatg ggcotgaaac agggttcagt 4080 

gccattgaca ctgatgaacg aaactctggt ccagccttga cagaatcctt ggtccagacc 4140

ctggtgaaga acaggacctt ctcaggctct ctgagccaco ttggggagtc cagcagttac 4200 

15 oagggcacag ggtcagtgca gtttccaggg gaccaggacc tccgttttgc cagggtcccc 4260 

ttagcgttac acccggtggt cgggcaacca ttcatgaagg ctgagggaag cagcaattct 4320 

gtggtacatg oagagacoaa attgcaaaac tatggggaga tggggccagg aaooaotggg 4380 

gccagcagct caggagoagg ccttcactgg gggggcccaa ctcagtcttc tgcttatgga 4440 

aaactctatc gggggcctac aagagtccca ccaagagggg gaagagggag aggagttcct 4500 

tactaaccoa gagacttcag tgtcctgaaa gattcatttc ctatccatcc ttccatccag 4560 

20 ttctctgaat otttaatgaa atcatttgco agagcgaggt aatcatctgc atttggctac 4620 

tgcaaagctg tccgttgtat tcottgctoa cttgctacta gcaggcgact taggaaataa 4680 

tgatgttggc accagttccc cotggatggg ctatagccag aacatttact tcaactctac 4740 

ottagtagat acaagtagag aatatggaga ggatcattac attgaaaagt aaatgtttta 4800 

ttagttcatt gcctgcactt aotggtcgga agagagaaag aacagtttoa gtattgagat 4860 

ggctcaggag aggctctttg atttttaaag ttttggggtg gggggttgtg tgtggtttct 4920

ttottttgaa ttttaattta ggtgttttgg gtttttttcc tttaaagaga atagtgttca 4980 

caaaatttga gctgctcttt ggcttttgct ataagggaaa cagagtggoc tggctgattt 5040 

gaataaatgt ttctttcctc tccaccatct cacattttgo ttttaagtga acaattttto 5100 

cccattgagc atcttgaaca tacttttttt ccaaataaat tactcatcat taaagtttac 5160

tccactttga caaaagatae gcccttctcc ctgcacataa agcaggttgt agaacgtggc 5220 

30 attcttgggc aagtaggtag actttaccca gtctctttcc ttttttgctg atgtgtgctc 5280 

totctctctc tttotctctc tctctctctc tetctctctc tatgtotgtc tcgottgatc 5340 

gototcgotg tttctctatc tttgaggoat ttgtttggaa aaaatcgttg agatgeccaa 5400 

gaaoetggga taattcttta ctttttttga aataaaggaa aggaaattoa aaaaaaaaaa 5460 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaa 5515 

<210> 6

<211> 6131

<212> DNA

<213> Homo sapiens

<400> 6

gaattctagg cccagttctg tgtttcccct gtgtgttcct aggcaggtca gtttccctcc 60 

atgggcctct gtaagatgag gagttggaga ggtacattct caggctactt tcaactccca 120 

gecaagtgae tcaagagtcc caggcagcac cagcacccct atctccaagg cctcctgatg 180 

tgtgtctcta tttagaactt aatccaacct acccaacatc agatcagtgt cttaccagcc 240 

caaggtccct ggggagcetc ctagagggag agagccctgc ccacccagat tgagggtaaa 300 

5 ggcotccccg tgctcatttt tgtaccacca cagtgcttgg cacatggtag acatcaaaat 360 

gtgtgtgctg aaagtataat tgaagttgtg tatatatgtc agctagagtg tctggagggg 420 

cagaaatgtg ggtctaaaaa atacaaatgc tccaaatggg gtgtgggcaa gggtotgtct 480 

acaccaggct gtgattacct gctcacatac atgtgtctat ctgagtaggg gtatgttatc 540 

tatttttcta caccacaggg tgaggaacag gtatatgtgt gcatgtgtat gcatccgtgt 600 

gtgtgtgtat gtgtgtgtgc atgagtgtgt gtgtgtgtgt ccaaagccac ctcttcaacc 660

tgtgccattt gtatctgtgt ctggcccaat gagagtgttg aaaggtgagc cacaagataa 720

aacagcaact tcctacctcc cttatcaaga cagctgtctg acctacctcc ccttggccac 780 

tcttgggatt actggggttg gcttcagtat tttcagattt ttcagaaggg gaggagaatg 840 

5 cttgagtctc atccaggaac ttaggcagtt ctcagcactg cctgctcotc ctccctcaaa 900 

taaccaagtc tgaagaccag gagagaaagc cgctggtgga ctggtcacat gtctggcagt 960 

gggaggagga gagtgagagg tttctaggta ggaatccaga cttagaccct cccctccacc 1020 

cccagatggg tggtgcacag gctcatctcg cggcccctcc ccactccacc ctaacatgga 1080

tacgccccca acaaccaagg aaagatctcc categgctga ctccacagat acacaoatgt 1140 

ccccacagac acacacacgc coatgeagag gcacagacat ccaggcacat ctttcccttt 1200 

10 ctctgtcttt cccttggttt gaatttcgtt tagecaoata tgttgtgtgt gcgtgagggt 1260 

gggtggggga ggggcagaca gggatgaggg atggcatggt gccaacatct acctatgggg 1320 

ctcgggccag ggacgcccot tacagccatc ctgggagggg gtctcagctg tccctttgtg 1380 

gecaagggga ccctcctggg gagtgggggc aagcacagag gtcctttctc cccaacccgg 1440 

ggtctggtcc ctgacccacc ttgggggcct gcaggggagg aaatggacag agcgggaccc 1500 

15 tgagggagca tagaattggc caccacgagc ccccagtgtc cagccttgcc accccattgt 1560 

tccogtgagg gggtctctat atacaggggg caactcctcc caccttcctc tcaatccctg 1620 

otttocotgq . gttgggcggg gaggggaggg eggcagaaat atttatttat ttcctttatt 1680 

tatttaattt tttttttttt tttttggagt agagagtgac agatggegge gggtcccggg 1740 

ggagcegget ctcccccagt geagaegcat gccaatcaoc gtctatcatg tgatagctgc 1800 

tgcaogtgac gtgccaagcc catatggoot ggcatagagg otggtacccc gectggtaga 1860 

20 gatgccacac tcgctccgcg gttcgcatgg cgctctgaag acgcoggcgc ccgccgcctt 1920 

gaggagcege tgcccccgct ccctgaagat gggggaacaa tgaaataagc gagaagatcc 1980 

ctcttctccc ccctctotct cttgccccct ccccccctcc cctcccctct ccccttgaot 2040 

cctctccgag gtaagttgto cgaaagggag cgagatctga cacgccggtt gggaggaggg 2100 

geggcagett cggccgacag gaggptcctc aaataoctcc ttcctgggat gatgccccca 2160 

25 tcattgggtg ggcateggag gggccccagg ttctototcc cttaggggct geagecoagg 2220 

gggctgeaga ggaggtgtct ctgcctgcga tgggctcggt ggggggggaa ggcaggatca 2280 

eggaggggga tatgegaaga ggccgagaog gaggacccct ccatggttgt cccaaaaagc 2340 

ctgccacctt tccccaccac cgaaaaaagg gaagcaaaca aacaaatttg gatttttccc 2400 

ooatcaatcc caaaatacaa cgagatctga agagecttgt gggagggagt cagcttgaag 2460 

ggggaagggg gtccctgacc gcagagggga eggactggge tegcttctot cagtctcctc 2520 

1 cccacgcccc getgettcag tcctcgocgc ccagagccgg ctccgggagc tggggacgea 2580 

teggctagag gagacgatcc tcccgcctct ggaattgggg gtgcgggggt gggggecgag 2640 

caaggggegg cgcgcagcca agttgcaaat tggattaggg agcgtggggg tgagagecac 2700 

gggaggggtg agggagctgg geegggggge ccgggccgcg agagegegga geggggcage 2760 

tgtccccacc ggcggccgac cagcctctct ccaccgccag gagagaaegg gctttcaggg 2820 

35 cgagcgcgcc gcctcccctg gcaaagatat ctggtcccta aaacccccac ccggtccctg 2880 

ccctgaccct gagaagaagc aggegegggg agcagccocc cattcaagcg aggggeggag 2940 

ccggggccca gcgccgggga gagggectgg gccgagatcc caggccggca geegggtagg 3000 

getgggcegg otctgggcgg ggcaggegge ggaggtgggc atccagggta gectaggcag 3060 

gagcccgcac gagacteggg ggtggaggag ggttgtgggg gggcgtcggt accccagcgc 3120 

40 gcccctcact ttgtgctgtc tgtctcccct tcccgoccgc ggggcgccct caggcaccat 3180 

gctgacccgc ctgttcagog agcccggcct tctctcggac gtgcccaagt tcgccagctg 3240 

gggegaegge gaagacgacg agecgaggag cgacaagggc gacgcgccgc caccgccacc 3300 

gcotgcgccc gggccagggg ctccggggcc agcccgggcg gccaagccag tccctctccg 3360 

tggagaagag gggaeggagg ccacgttggc cgaggtcaag gaggaaggcg agctgggggg 3420 

agaggaggag gaggaagagg aggaggaaga aggactggac gaggeggagg gcgagcggcc 3480 

45 caagaagege gggeccaaga agegcaagat gaecaaggeg cgcttggagc gctocaagct 3540 

teggeggcag aaggcgaacg cgcgggagcg caaccgcatg cacgacctga acgcagccct 3600 

ggacaacctg cgcaaggtgg tgccctgcta ctccaagacg cagaagctgt ccaagatcga 3660 

gacgctgcgc etagecaaga actatatctg ggcgctctcg gagatcctgc gctccggcaa 3720 

gcggccagac ctagtgtcct aegtgeagae tctgtgcaag ggtctgtcgo agcccaccac 3780 

50 caatctggtg gccggctgtc tgcagctcaa ctctcgcaac ttcctcacgg ageaaggege 3840 

cgacggtgcc ggccgcttcc aeggcteggg cggcccgttc gccatgcacc cctacccgta 3900 

cccgtgctcg cgcctggcgg gcgcacagtg ccaggcggcc ggcggcctgg gcggcggcgc 3960 

ggcgcacgcc ctgcggaccc aeggctactg cgccgcctac gagaegctgt atgeggegge 4020 

aggcggtggc ggcgcgagcc eggactacaa cagctccgag tacgagggee cgctcagccc 4080 

cccgctctgt ctcaatggca acttctcact caagcaggac tcctcgcccg accacgagaa 4140 

aagctaccac tactctatgc actactcggc gctgcccggt tcgcgccacg gccacgggct 4200

agtcttcggc tcgtcggctg tgcgcggggg cgtccactcg gagaatctct tgtcttacga 4260

tatgcacctt caccacgacc ggggccccat gtacgaggag ctcaatgcgt tttttcataa 4320 

ctgagacttc gcgccggctc cottcttttt cttttgcctt tgcocgcccc cctgtccoca 4380 

5 gcccccagca gcgcagggta cacccccatc ctaccccggc gccgggcgcg gggagcgggc 4440 

caocggtcct gccgctctcc tggggcagcg cagtcctgtt acctgtgggt ggcctgtcco 4500 

aggggcctcg cttcccocag gggactcgcc ttctotctcc ccaaggggtt ccctcotcct 4560 

ctctcccaag gagtgcttct ccagggacct ctctccgggg gctccctgga ggcacccctc 4620 

ccccattccc aatatcttcg ctgaggtttc ctoctocccc tcctccctgc aggcccaagg 4680 

cgttggtaag ggggcagctg agcaatggaa cgcgtttccc cctotoatta ttattttaaa 4740 

10 aacagacacc cagctgccga ggcaaaaagg agccaggcgc tccctctttc ttgaagaggg 4800 

tagtattttg ggcgccggag cocgggcotg gaacgccctc acccgcaacc tccagtctcc 4860 

gcgttttgcg attttaattt tggcgggagg ggaagtggat tgagaggaaa gagagaggcc 4920 

aagacaattt gtaactagaa tocgtttttc ccttttcctt tttttaaaca aacaaacata 4980 

oaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aagctaagag gogacggaag ccgaacgcag 5040 

agtooggatc ggagagaaaa cgcagtaagg aottttagaa gcaataaaag gcaaaaaaaa 5100 

caaaaaacaa aaaaacaaac aaaaaaaaac cactactacc aataatcaaa gacacaaata 5160

tctatgcaag gaggctocac tgagcctcgc ggcccggcoc ggccccggga tgooocgccc 5220 

ggcctgoggg ccgccccgco cgagcgcgga tctgtgcaot ttggtgaagt gggggcccgc 5280 

gccgccccct ccccctcccc aggttcttac aatcagtgac tcggagattt ggggccccag 5340 

tgccactgcc otcocccgcc ccgtccccgt tgtgagtoat gctgtttttt aaaaacctgt 5400 

ttccaaattt gtatggaatg gcaaactgtt ggggggtcgg tttggggagg gagggtttgc 5460 

atgaaagaca cacgcacacc acaccgcacg cacaagcagg cccggcgccg gcgtccgggg 5520 

ggcagaagga ggtgagctcg coggctcoto ctcccogcgg ccattctgtc ocotcctggg 5580 

9t9*99Q9tg gggatggaga cctgggggca gccccacccc tgcccggact gtgcctcggt 5640 

gggtgacacc tggcgatttc cggtgtctgg agagagtatt ttttggtcca aggagtcctc 5700 

25 ttggotttag ctggtgggtg ggcggggaga ggtctgaggg ctcctactgg aggttccccc 5760 

aaaaaggggc aaaaggagao catctgcoca ccggaggcag gggatcaggc atccaaatac 5820 

acgatgcaaa aatgcaatcc cacaggogac acacccacac actcaccaac acacacgcaa 5880 

ttttaccttc ctcttgtagc gaagatgaaa ctcccgtcgg acacocgaag tgcattgcgt 5940 

gtttctgttc agtttaatga cgattaataa atatttatgt aaatgagatg caaagccgga 6000 

ccggtttctc acggtggcot catttcattg aggggggaga gaaggtttga gctggggctg 6060 

gggtgatgaa ggcagagtgt caagtgactg tgcagaggoo aaacagaggg aottcccagc 6120 

aaaaagcact g 6131 



<210> 7

<211> 2020

<212> DNA

<213> Homo sapiens

<400> 7

gctactgagg ccgcggagcc ggactgcggt tggggcggga agagccgggg ccgtggctga 60

catggagcag ccctgctgct gaggccgcgc cctccccgcc ctgaggtggg ggcccaccag 120

gatgagcaag ctgcccaggg agctgacccg agacttggag cgcagcctgc ctgccgtggc 180

ctccctgggc tcctcactgt cccacagcca gagcctctcc tcgcacctcc ttccgccgcc 240

tgagaagcga agggccatct ctgatgrtccg ccgcaccttc tgtctcttcg tcaccttcga 300

cctgctcttc atctccctgc tctggatcat cgaactgaat accaacacag gcatccgtaa 360

gaacttggag caggagatca tccagtacaa ctttaaaact tccttcttcg acatctttgt 420

cctggccttc ttccgcttct ctggactgct cctaggctat gccgtgctgc agctccggca 480

ctggtgggtg attgcggtca cgacgctggt gtccagtgca ttcctcattg tcaaggtcat 540

cctctctgag ctgctcagca aaggggcatt tggctacctg ctccccatcg tctcttttgt 600

cctcgcctgg ttggagacct ggttccttga cttcaaagtc ctaccccagg aagctgaaga 660

ggagcgatgg tatcttgccg cccaggttgc tgttgcccgt ggacccctgc tgttctccgg 720

tgctctgtcc gagggacagt tctattcacc cccagaatcc tttgcagggt ctgacaatga 780

atcagatgaa gaagttgctg ggaagaaaag tttctctgct caggagcggg agtacatccg 840

ccaggggaag gaggccacgg cagtggtgga ccagatcttg gcccaggaag agaactggaa 900

gtttgagaag aataatgaat atggggacac cgtgtacacc attgaagttc cctttcacgg 960

caagacgttt atcctgaaga ccttcctgcc ctgtcctgcg gagctcgtgt accaggaggt 1020

gatcctgcag cccgagagga tggtgctgtg gaacaagaca gtgactgcct gccagatcct 1080

gcagcgagtg gaagacaaca ccctcatctc ctatgacgtg tctgcagggg ctgcgggcgg 1140

cgtggtctcc ccaagggact tcgtgaatgt ccggcgcatt gagcggcgca gggaccgata 1200

cttgtcatca gggatcgcca cctcacacag tgccaagcca ccgacgcaca aatatgtccg 1260

gggagagaat ggccctgggg gcttcatcgt gctcaagtcg gccagtaacc cccgtgtttg 1320

cacctttgtc tggattctta atacagatct caagggccgc ctgccccggt acctcatcca 1380

ccagagcctc gcggccacca tgtttgaatt tgcctttcac ctgcgacagc gcatcagcga 1440

gctgggggcc cgggcgtgac tgtgccccct cccaccctgc gggccagggt cctgtcgcca 1500

ccacttccag agccagaaag ggtgccagtt gggctcgcac tgcccacatg ggacctggcc 1560

ccaggctgtc accctccacc gagccacgca gtgcctggag ttgactgact gagcaggctg 1620

tggggtggag cactggactc cggggcccca ctggctggag gaagtggggt ctggcctgtt 1680

gatgtttaca tggcgccctg cctcctggag gaccagattg ctctgcccca ccttgacagg 1740

gcagggtctg ggctgggcac ctgacttggc tggggaggac cagggccctg ggcagggcag 1800

ggcagcctgt cacccgtgtg aagatgaagg ggctcttcat ctgcctgcgc tctcgtcggt 1860

ttttttagga ttattgaaag agtctgggac ccttgttggg gagtgggtgg caggtggggg 1920

tgggctgctg gccatgaatc tctgcctctc ccaggctgtc cccctcctcc cagggcctcc 1980

tgggggacct ttgtattaag ccaattaaaa acatgaattt 2020

<210> 8

<211> 1730

<212> DNA

<213> Homo sapiens

<400> 8

gtggtgaggg tgactgggga ctaggcacta ggcctttggt gcaggcgcct gaggacktgg 60

ttgcactctc ccttctgggg atatgccctt gagcccaggc agaggagagc acagcccagg 120

gcaggacctg gcagccctgg tacagagccc agagggggca tcagttcctg ctggtcctgc 180

tctgtttaca gacaasctgc tgtcctccct gcaaagggga gtgggtgggg cagagggcaa 240

ktgccagggg ggcacaaggc tgggcatgtg gctggcatga gacggtgtct gagtaatgtc 300

aggcacctgg aggcattgac cccaggacct tggaccccag acctctgacc gtggggcagc 360

cagcgtccag gtaccccaac ccctgccctg ggtccggcgt ccccccatta gtgagtcttg 420

gctctactta tagcatctga caccagaggg gccgaaaata gcccctggag aagggggagg 480

agggggctat ttaaagggcc tgggagggga gagagaatga ggagtgatca tggctacctc 540

agagctgagc tgcgaggtgt cggaggagaa ctgtgagcgc cgggaggcct tctgggcaga 600

atggaaggat ctgacactgt ccacacggcc cgaggagggg tgagtgtggg tctgctagag 660

tccctgcctc tgctccccca gagcaccctc actgagccat gaggccagag catgaagccc 720

tggagaaatt tctgggggtg ggggcaggaa gaatgcccca tggggagagc aaaggggaac 780

cacccttcct gcccccaggt cccagcagcc caggggagcc ccccacccag cctgtgccca 840

gagagcaaca gctcccagga gctcactgcc cctcccatct ccccagctgc tccctgcatg 900

aggaggacac ccagagacat gagacctacc accagcaggg gcagtgccag gtgctggtgc 960



agcgctcgcc ctggctgatg atgcggatgg gcatcctcgg ccgtgggotg caggagtacc 1020 

agctgcccta ecagegggta ctgccgctgc ccatcttcac ccctgccaag atgggcgcca 1080 

ccaaggagga gcgtgaggao acccccatoc agcttcagga gctgctggcg ctggagacag 1140 

50 ccctgggtgg ccagtgtgtg gaccgccagg aggtggctga gatcacaaag cagctgoccc 1200 

ctgtggtgcc tgtcagcaag cccggtgcmc ttcgtcgctc cctgtcccgc tccatgrtccc 1260 

aggaagcaca gagaggctga gagggactgt gacttgggct ccgctgtgcc cgccccctgg 1320 

gctgggccct tcctggctag gacctgtgga ggggcagctc gctggcacat ggctgctttg 1380 

tagtttgece agagttgggg gctaggggag gggggageca gaggecagga tgcctgagcc 1440 

ccctgagttc ccaaagggag ggtggcagag acagtgggca ctaagggtgg agagttgggg 1500 

gccagcacag ctgaggaccc tcagccccag gagaagggac aaaaggtact ggtgagggca 1560
agaggtgcct gggaggagtg gccctgatcc aggaaaatgt gaggggaatc tggaacgctc 1620
taggcagaag aagctgggag ggagggggag gtgaaaaggg cagaggcaag gatggtgggg 1680 
cccccagcac cctctgttag tgccgcaata aatgctcaat catgtgccag 1730 



<210> 9

<211> 3799

<212> DNA

<213> Homo sapiens

<400> 9

ctggcaotgg gtggtaacca gcaagccagc tggcatccgc atccagggtt tgtttcaatg 60 

atgtctcgtg gagaatatgg aggggotggt gccaggactg tccttggctt tgcctcgggg 120 

tgtgaacggg gtcagtgacc totaaaacta acotgcctat cagttctgaa tcaagacaga 180 

atoaatcctc agctg^gtct cgctccacac ecactgocct ggaagccagg gaaggttgga 240 

ggtgctaggg ggtcaggctc ccctctgtga cccotgoagc tgttgtggtg actcatgtcc 300 

20 caacctagct gcctctccca aggagaattt cccctgggac aagggggagg gaatggcatg 360 

gaggaggccc acatcaagcg gggccaggaa cccaoggtgg caggagctgg gctggtgacc 420 

tacccagggc agaagggccc gggactcatc cagaggggaa ggaaggggtc ttcaggaaga 480 

ccacggagat gccacaggca gaattggott cccatctggg agataggtgg ggagaccotg 540 

gcattttgac agccagaacc tggggtgctg agcagaatct tcatgoctgg cctggccgcc 600 

25 ttcggaggga agctggaggg ttgggtgcga gaggagtggg gtcagagccc ctacatcogc 660 

aggaccccaa atcggctggg ccccaaggcc cggactgcgc tccccggtgg ccccggcggo 720 

cctccgcgaa tgcgtcctgc caatcccctg cccaagccct ctgocctcac ccgggtccgg 780 

cgccgccccc gaagtggcgg gaacaaoccg aacccgaacc ttctgtcctc gggagccccc 840 

agataagcgg ctgggaaccc gcggggcccg caggggaggo ccggctgttc cgcccgctaa 900 

gtgcattagc acagctcacc tcccctatcg cgcctgccat cggacgggca gtgccgcgcc 960 

30 otgctctggg gcccccggag cgaccacagc ggaggccgga acggactgtc ctttctgggg 1020 

oggggtgggg agggggtgtc gctggagggo ccggtggcat agcaacggac gagagaggcc 1080 

tggaggaggg gcggggaggg ggagttgtgt ggcagttcta agggaagggt gggtgctggg 1140 

acgggtgtcc gggagggagg ggagcotggc ggggtctggg gcctcgtcgc ggagggcgct 1200 

gcgaggggga aactggggaa agggcctaat tccccagtct ccacctcgaa tcaggaaaga 1260 

35 gaaggggcgg gatgctgggc aaaagaggtg aatggctgcg gggggctgga gaagagagat 1320 

gggaggggcc ggccggcggg ggtgaggggg tctaaagatt gtgggggtga ggaactgagg 1380 

gtggggggcg cccagaggcg ggactcgggg cggggcaggc gaggcggagg gcgagggctg 1440 

cgggagcaag taoggagccg ggggtgtggg ggacgattgc cgctgcagcc gccgccccac 1500 

tcacctccgg tgtgtctgca gcccggacaa taagggagat ggatgaatgg gtggggagga 1560 

40 tgcggcgcac atggccccgg gcggotcggc ggtcagctgc cgcccccaca gcggaccggt 1620 

cggggcgggg gtcgggcggt agaaaaaagg gccgcgaggc gagcggggca ctgggcggac 1680 

cgcggcggoa gcatgagcgg cgcagaocgt agccccaatg cgggogcagc ccctgactcg 1740 

gccccgggcc aggcggcggt ggcttcggcc taccagcgct tcgagccgcg cgcctaccto 1800 

cgcaacaact acgcgccccc tcgcggggac ctgtgcaacc cgaacggcgt cgggccgtgg 1860 

aagctgcgct gcttggcgca gaccttcgcc accggtgagc gggggaaact gaggoacgag 1920 

45 ggacaagagg tcgtcgggga gtgaaagcag gcgcagggaa ataaaaagaa ggaaagggag 1980 

acagaccagg cgcctaacag atggggacca agaaacaaga gatagctgag aggtgoaaac 2040 

agaagagaaa aaggagcaac atcccttagg agaggggcag aggagagaga ggtggagaga 2100 

gggggcggag agtgctcaga attgagagct aaggtggggg atgcaggaca gactgaggtg 2160 

gagatgcata ggaggaaatg gaggcagatg tgggacaggg gtgagaaact ccaggattto 2220 

ctcgctgagc ctggctggta ggtatagttg ttttctttct ttttctttat tttattttca 2280

tttatttact tatttttatt ttttatttgt tttgagacgg agtttcgctc ttgttgccca 2340

ggctggagta caatggcgcc atctcggctc actgcaacct ccgcctcccc gggttcaagc 2400

gattctcttg cctcagcttc cctagtagct gggattacag gcatgcgccc ccatgcctgg 2460 

ctaatttatt tgtattttta gtagagacgg gacttctcca tgttggtcag gctggtctcg 2520 

aactcccaac cttaggatcc acccaccccg gcctcccaaa gtgctgggat tacaggtgtg 2580 

agccactgcg cccggccagt aggtatagtc ttctagatgt gaaacctgag tctcagagcg 2640

gtgaagttcc cttccgaagg gcagcccatg ttggagctgg gttcagtcta actctggggc 2700

caatgctttt tccagatgga gacacatttg cagaggagaa ggaagaacta gagagaggca 2760 

gggagatgca ggggagggaa gggtaaggag gcaggggctg cctgggctgg ctggcaccag 2820 

5 gaccctcttc ctctgccctg cccaggtgaa gtgtccggao gcaccctcat cgacattggt 2880 

tcaggcccca ccgtgtacca gctgctcagt gcctgcagcc actttgagga catcaocatg 2940 

acagatttcc tggaggtcaa ccgcoaggag ctggggcgct ggctgcagga ggagccgggg 3000 

gccttcaact ggagcatgta cagccaacat gcctgcctca ttgagggcaa ggggtaagga 3060 

ctggggggtg agggttgggg aggaggcttc ccatagagtg gctggttggg gcaacagagg 3120 

cctgagcgta gaacagcctt gagccctgcc ttgtgcctcc tgcacaggga atgctggcag 3180 

10 gataaggagc gccagctgog agccagggtg aaacgggtcc tgcccatcga cgtgcaccag 3240 

ccccagcccc tgggtgctgg gagcccagct cccctgcctg ctgacgccct ggtctctgcc 3300 

ttctgcttgg aggctgtgag cccagatctt gccagctttc agcgggccct ggaccacatc 3360 

accacgctgc tgaggcctgg ggggcacatc ctcctcatcg gggccctgga ggagtcgtgg 3420

tacctggctg gggaggccag gctgacggtg gtgccagtgt ctgaggagga ggtgagggag 3480 

15 gccctggtgc gtagtggcta caaggtcogg gacctccgca octatatcat goctgcccac 3540 

cttcagacag gcgtagatga tgtcaagggc gtcttcttcg cctgggotca gaaggttggg 3600 

otgtgagggo tgtacctggt gccctgtggc ccccacocac ctggattccc tgttctttga 3660 

agtggcacct aataaagaaa taataocctg ccgctgcggt cagtgctgtg tgtggctctc 3720 

ctgggaagca gcaagggccc agagatctga gtgtccgggt aggggagaca ttcacoctag 3780 

gctttttttc cagaagctt 3799

<210> 10

<211> 4530

<212> DNA

<213> Homo sapiens 



<400> 10 

aattotcgag ctcgtcgacc ggtcgacgag ctcgagggtc gacgagctcg agggcgcgcg 60 

cccggccccc acccctcgca goaccocgcg ccocgcgccc tcccagccgg gtccagccgg 120 

agccatgggg ccggagccgc agtgagcacc atggagctgg cggccttgtg ccgctggggg 180 

ctcotcctcg ccctcttgcc ccccggagcc gcgagcacoc aagtgtgcac cggcaoagac 240 

35 atgaagctgc ggctccctgo cagtcccgag aoocacctgg acatgctccg ccacctctac 300 

cagggctgcc aggtggtgca gggaaacctg gaactcacct acctgcccac caatgccagc 360 

ctgtccttcc tgcaggatat ccaggaggtg cagggctacg tgctcatcgc tcacaaccaa 420

gtgaggcagg tcccactgoa gaggctgegg attgtgcgag gcacccagct ctttgaggac 480 

aactatgccc tggccgtgct agacaatgga gacccgctga acaataccac ccctgtcaca 540 

ggggcctccc caggaggect gcgggagctg cagcttcgaa gcctcacaga gaj:cttgaaa 600 

ggaggggtct tgatccagcg gaacccccag ctctgctacc aggacacgat tttgtggaag 660 

gacatcttcc acaagaaaaa ccagctggct ctcacactga tagacaccaa ccgctctcgg 720 

gcctgccacc cctgttctcc gatgtgtaag ggctcccgct gctggggaga gagttctgag 780 

gattgtcaga gcctgacgcg cactgtctgt gccggtggct gtgcccgctg caaggggeca 840 

ctgcccactg actgctgcca tgagcagtgt gctgccggct gcacgggccc caagcactct 900 

gaetgectgg cctgcctcca cttcaaccac agtggcatct gtgagatgea ctgcccagcc 960 

ctggtcacct acaacacaga caegtttgag tccatgccca atcocgaggg ceggtataca 1020 

ttcggcgcca gctgtgtgac tgcctgtccc tacaactacc tttctacgga cgtgggatcc 1080 

tgcaccctcg tctgccccct gcacaaccaa gaggtgacag cagaggatgg aacacagegg 1140 

tgtgagaagt gcagcaagcc ctgtgcccga gtgtgctatg gtctgggcat ggagcacttg 1200 

cgagaggtga gggcagttac cagtgccaat atccaggagt ttgctggctg caagaagatc 1260 

tttgggagcc tggcatttct geeggagage tttgatgggg acccagcctc caacactgcc 1320 

ccgctccagc cagagcagct ccaagtgttt gagactctgg aagagatcac aggttaccta 1380 

tacatctcag catggccgga cagcctgcct gacctcagcg tcttccagaa ectgeaagta 1440 

ateeggggae gaattctgea caatggcgcc tactegctga ccctgcaagg gctgggcatc 1500 

agctggctgg ggctgcgctc actgagggaa ctgggcagtg gactggccct catccaccat 1560 

aacacccacc tetgettegt geacaeggtg ccctgggacc agctctttcg gaacccgcac 1620 



caagctctgo tccacacfcgc caaccggcca gaggacgagt gtgtgggcga gggcctggcc 1680 

tgccaccagc tgtgcgcccg agggcactgc tggggtccag ggcccaccca gtgtgtcaac 1740 

tgcagccagt tccttcgggg ccaggagtgc gtggaggaat gccgagtact gcaggggctc 1800 

cccagggagt atgtgaatgc caggcactgt ttgccgtgcc accctgagtg tcagccccag 1860 

aatggctcag tgacctgttt tggaccggag gctgacoagt gtgtggcctg tgcccaotat 1920 

aaggaccctc ccttctgcgt ggoccgctgc cccagcggtg tgaaaoctga cctctcctac 1980 

atgcccatct ggaagtttcc agatgaggag ggcgcatgcc agccttgccc catcaactgc 2040 

acccactcct gtgtggacct ggatgacaag ggctgccccg ccgagcagag agccagccct 2100 

ctgacgtcca tcgtctctgc ggtggttggc attctgctgg tcgtggtctt gggggtggtc 2160 

tttgggatcc tcatcaagcg acggcagcag aagatccgga agtacacgat gcggagactg 2220 

ctgcaggaaa cggagctggt ggagccgctg acacctagcg gagcgatgco caaccaggcg 2280 

cagatgcgga tcctgaaaga gacggagctg aggaaggtga aggtgcttgg atctggcgct 2340 

tttggcacag tctacaaggg catctggatc octgatgggg agaatgtgaa aattccagtg 2400 

gccatcaaag tgttgaggga aaacacatcc cocaaagcca acaaagaaat cttagacgaa 2460 

gcataagtga tggctggtgt gggctcccca tatgtctcca gccttctggg catctgcctg 2520 

acatccacgg tgcagctggt gacacagctt atgccctatg gctgcctott agaccatgtc 2580 

cgggaaaacc gcggacgcct gggctcccag gacotgotga actggtgtat gcagattgcc 2640 

aaggggatga gctaoctgga ggatgtgcgg ctcgtacaca gggacttggc cgctcggaac 2700 

gtgctggtca agagtcccaa ccatgtcaaa attacagact tcgggctggc tcggctgctg 2760 

gacattgacg agacagagta coatgcagat gggggcaagg tgcccatcaa gtggatggcg 2820 

ctggagtcca ttctccgacg gcggttcacc cacoagagtg atgtgtggag ttatggtgtg 2880 

actgtgtggg agctgatgaa ttttggggcc aaaccttacg atgggatccc agcccgggag 2940 

atccctgacc tgctggaaaa gggggagogg ctgccccagc cccccatctg caccattgat 3000 

gtctacatga tcatggtcaa atgttggatg attgactctg aatgtaggcc aagattcogg 3060 

gagttggtgt ctgaattctc ccgcatggcc agggaccccc agcgctttgt ggtoatccag 3120 

aatgaggact tgggcccagc cagtcccttg gacagcacct totaccgcto actgctggag 3180 

gacgatgaca tgggggaoot ggtggatgct gaggagtatc tggtaccoca gcagggcttc 3240 

ttctgtccag accctgcccc gggcgctggg ggcatggtcc accacaggca ccgcagctca 3300 

totaccagga gtggcggtgg ggacctgaca ctagggctgg agccctctga agaggaggcc 3360 

cccaggtctc cactggcacc ctccgaaggg gctggctccg atgtatttga tggtgacctg 3420 

ggaatggggg cagccaaggg gctgcaaago ctcoccacac atgaccccag ccctctacag 3480 

cggtacagtg aggaccccac agtaaocatg ocototgaga ctgatggcta cgttgocccc 3540 

otgacctgca gcccccagoo tgaatatgtg aaccagccag atgttcggcc ccagccccct 3600 

tcgccccgag agggccctct goctgotgoo cgacctgctg gtgccactot ggaaagggcc 3660 

aagactctct ccccagggaa gaatggggtc gtcaaagacg tttttgcott tgggggtgcc 3720 

gtggagaacc ccgagtactt gaoaccccag ggaggagctg cccctcagcc ccaccotcct 3780 

cctgccttca gcccagcctt cgaoaaooto tattactggg accaggaccc accagagcgg 3840 

ggggctccac ccagcacctt caaagggaca ootacggcag agaacccaga gtacctgggt 3900 

ctggacgtgc cagtgtgaac cagaaggcca agtccgcaga agccctgatg tgtootcagg 3960 

gagcagggaa ggcctgaatt ctgctggaat caagaggtgg gagggccota cgaaoacttc 4020 

caggggaacc tgccatgcca ggaacctgtc ctaaggaaco ttcottcctg cttgagttcc 4080 

cagatggctg gaaggggtcc agcctcgttg gaagaggaaa agcactgggg agtetttgtg 4140 

gattctgagg ccctgccoaa tgagactcta gggtccagtg gatgccacag cccagottgg 4200 

ccctttcctt ccagatcctg ggtactgaaa gccttaggga agctggcctg agaggggaag 4260 

cggccctaag ggagtgtcta agaacaaaag cgacccattc agagactgtc cctgaaacct 4320 

agtactgccc cccatgagga aggaacagca atggtgtcag tatccaggct ttgtacagag 4380 

tgcttttctg tttagttttt actttttttg ttttgttttt ttaaagacga aataaagacc 4440 

caggggagaa tgggtgttgt atggggaggc aagtgtgggg ggtccttctc cacacccact 4500 

ttgtccattt gcaaatatat tttggaaaac 4530 

<210> 11 

<211> 2205 

<212> DNA 

<213> Homo sapiens 

<400> 11 

cacagggotc coccccgcct ctgacttctc tgtccgaagt cgggacaccc tcctaccacc 60 

tgtagagaag cgggagtgga tctgaaataa aatccaggaa tctgggggtt cctagacgga 120 

gccagacttc ggaacgggtg tectgctact cctgctgggg ctcctccagg acaagggcac 180 

acaactggtt ccgttaagcc cctctctcgc tcagacgcca tggagctgga tctgtctcca 240 

cctcatctta gcagctctcc ggaagacctt tggccagccc ctgggacccc tcctgggact 300 

ccccggcccc ctgatacccc tetgectgag gaggtaaaga ggtcccagcc tctcctcatc 360 

ccaacoaccg gcaggaaact tcgagaggag gagaggegtg ccacctccct cocctctatc 420 

cccaacccct tccctgagct otgeagtect ccctcacaga gcacaattct cgggggcccc 480 

tccagtgcaa gggggctget cccccgcgat gccagccgcc cccatgtagt aaaggtgtac 540 

agtgaggatg gggcctgcag gtctgtggag gtggcagcag gtgccacagc tcgccacgtg 600 

tgtgaaatgc tggtgoagcg agctcacgcc ttgagcgacg agacctgggg gctggtggag 660 

tgccaccccc acctagcact ggagcggggt ttggaggacc aegagtcegt ggtggaagtg 720 

caggctgcct ggcccgtggg eggagatage cgcttcgtct teeggaaaaa ottcgecaag 780 

tacgaactgt tcaagagctc cccacactcc ctgttcccag aaaaaatggt ctccagctgt 840 

ctcgatgcac acactggtat atcccatgaa gacotcatcc agaacttcct gaatgctggc 900 

agctttcctg agatccaggg etttctgeag ctgcggggtt caggaeggaa gctttggaaa 960 

egctttttet gtttcttgcg ccgatctggc ctctattact ccacoaaggg cacctctaag 1020 

gatocgaggc acotgeagta cgtggcagat gtgaacgagt ccaacgtgta cgtggtgacg 1080 

cagggccgca agetctaegg gatgcccact gaetteggtt tctgtgtcaa gcccaacaag 1140 

cttcgaaatg gacacaaggg getteggate ttctgcagtg aagatgagca gagccgcacc 1200 

tgctggctgg ctgccttccg cctcttcaag tacggggtgc agctgtacaa gaattaccag 1260 

caggcacagt ctcgccatct gcatccatct tgtttgggct ccccaccctt gagaagtgee 1320 

tcagataata ccctggtggc catggacttc tctggccatg ctgggcgtgt cattgagaac 1380 

ccccgggagg ctctgagtgt ggccctggag gaggeccagg cctggaggaa gaagacaaac 1440 

caccgcctca gcctgcccat gccagcctcc ggcacgagcc teagtgeage catccaccgc 1500 

acccaactct ggttccacgg gcgcatttcc cgtgaggaga gccagcggct tattggacag 1560 

cagggcttgg tagaeggect gttcctggtc egggagagtc agcggaaccc ccagggcttt 1620 

gtcctctctt tgtgccacct gcagaaagtg aagcattatc tcatcctgcc gagagaggag 1680 

gagggtcgee tgtacttcag catggatgat ggccagaccc gcttcactga ectgotgeag 1740 

ctcgtggagt tccaccagct gaaccgcggc atcctgccgt gettgetgeg coattgctgc 1800 

acgcgggtgg ccotctgacc aggccgtgga ctggctcatg cctcagcccg ccttcaggct 1860 

gcccgccgcc cctccaccca tocagtggac tetggggege ggccacaggg gaegggatga 1920 

ggagegggag ggttccgcca atccagtttt ctcctctgct tctttgcctc cctcagatag 1980 

aaaacagccc ccactccagt ccactcctga cccctctcct caagggaagg ccttgggtgg 2040 

ccccctctoc ttctcctagc tctggaggtg ctgetctagg gcagggaatt atgggagaag 2100 

tgggggcagc ecaggeggtt tcacgcccca cactttgtao agaccgagag gecagttgat 2160 

ctgotctgtt ttatactagt gacaataaag attatttttt gatac 2205 

<210> 12 
<211> 2177 
<212> DNA 

<213> Homo sapiens 

<400> 12 

gaattcgegg ccgctggttt gcagctgctc cgtcatcgtg cggcccgacg ctatctcgcg 60 

ctcgtgtgca ggcccggctc ggctcctggt ccccggtgcg agggttaacg cgaggccccg 120 

gcctcggtcc ceggactagg ccgtgacccc gggtgccatg aagcaggagg gctcggcgcg 180 

gcgccgcggc geggacaagg cgaaaccgcc gcccggcgga ggagaacaag aacccccacc 240 

gccgccggcc ccccaggatg tggagatgaa agaggaggca gcgacgggtg gegggtcaac 300 

gggggaggca gaeggcaaga cggcggcggc ageggttgag cactcccagc gagagctgga 360 

cacagtcacc ttggaggaca tcaaggagca cgtgaaacag ctagagaaag cggtttcagg 420 

caaggagccg agattcgtgc tgcgggccct gcggatgctg ccttccacat cacgccgcct 480 

caaccactat gttctgtata aggctgtgca gggcttcttc acttcaaata atgccactcg 540 

agactttttg ctccccttcc tggaagagcc catggacaca gaggctgatt tacagttccg 600 

tccccgcacg ggaaaagctg cgtcgacacc cctcctgcct gaagtggaag cctatctcca 660 

actccfccgtg gtcatcttca tgatgaacag caagcgctac aaagaggcac agaagatctc 720 

tgatgatctg atgcagaaga tcagtaotca gaaccgccgg gccctagacc ttgtagccgc 780 

aaagtgttac tattatcacg cccgggtcta tgagttcctg gacaagctgg atgtggtgcg 840 

cagcttcttg catgctcggc tccggacagc tacgottogg oatgacgcag acgggcaggc 900 

caccctgttg aacctcctgc tgcggaatta cctacactac agcttgtacg accaggctga 960 

gaagctggtg tccaagtctg tgttcccaga gcaggccaac aacaatgagt gggccaggta 1020 

cotctactac acagggcgaa tcaaagccat ccagotggag tactcagagg cccggagaac 1080 

gatgaccaac gcccttogca aggcccctca gcacaoagct gtcggcttca aacagacggt 1140 

gcacaagctt ctcatcgtgg tggagctgtt gctgggggag atccctgacc ggctgcagtt 1200 

ocgocagccc tccctcaagc gctcactcat gccctattto cttctgactc aagotgtcag 1260 

gacaggaaac ctagccaagt tcaaccaggt ootggatcag tttggggaga agtttcaagc 1320 

agatgggacc tacaccctaa ttatcoggct gcggcacaac gtgattaaga caggtgtacg 1380 

catgatcagc ctctcctatt ccagaatotc cttggctgac atcgcccaga agotgcagtt 1440 

ggatagccoc gaagatgcag agttcattgt tgccaaggcc atccgggatg gtgtcattga 1500 

ggccagcatc aaccacgaga agggctatgt ccaatccaag gagatgattg acatctatto 1560 

cacccgagag ccccagctag cottccacca gcgcatctco ttctgcctag atatccaoaa 1620 

catgtctgtc aaggccatga ggtttcctcc caaatcgtac aacaaggact tggagtctgc 1680 

agaggaacgg cgtgagcgag aacagcagga cttggagttt gccaaggaga tggcagaaga 1740 

tgatgatgac agcttccctt gagctggggg gctggggagg ggtaggggga atggggacag 1800 

gctctttccc ccttgggggt cccctgccaa gggcactgto cccattttcc cacacaoago 1860 

tcatatgctg cattcgtgca gggggtgggg gtgctgggag ccagccaccc tgacctccca 1920 

cagggctcct coccagccgg tgacttactg tacagcaggc aggagggtgg gcaggcaacc 1980 

tccccgggca gggtcctggc cagcagtgtg ggagcaggag gggaaggata gttatgtgta 2040 

ctcctttagg gagtggggga ctagaactgg gatgtcttgg cttgtatgtt ttttgaagct 2100 

tcgattatga tttttaaaca ataaaaagtt otcccaaaaa aaaaaaaaaa aaaaaaaaaa 2160 

aaagcggccg cgaattc 2177 

<210> 13 
<211> 2960 
<212> DNA 

<213> Homo sapiens 

<400> 13 

ctgccgcttc caggcgtcta tcagcggctc agcctttgtt cagctgttct gttcaaacac 60 

tctggggcca ttcaggcctg ggtggggcag cgggaggaag ggagtttgag gggggcaagg 120 

cgacgtcaaa ggaggatcag agattccaca atttcacaaa actttcgcaa acagcttttt 180 

gttccaaccc ccctgcattg tcttggacac oaaatttgca taaatcctgg gaagttatta 240 

ctaagcctta gtcgtggccc caggtaattt cotcccaggc ctccatgggg ttatgtataa 300 

agggccccct agagctgggc cccaaaacag cccggagcot gcagcccagc cccacccaga 360 

cccatggctg gacctgccac ccagagcccc atgaagctga tgggtgagtg tcttggccca 420 

ggatgggaga gccgcctgco ctggcatggg agggaggctg gtgtgacaga ggggctgggg 480 

atccccgttc tgggaatggg gattaaaggc acccagtgtc cccgagaggg cctoaggrtgg 540 

tagggaacag catgtctcct gagcccgctc tgtcoocagc cctgcsagctg ctgctgtggc 600 

acagtgcact ctggacagtg caggaagcca cccccctggg ccctgccago tccctgcccc 660 

agagcttcct gctcaagtgc ttagagcaag tgaggaagat ccagggogat ggcgcagcgc 720 

tccaggagaa gctggtgagt gaggtgggtg agagggctgt ggagggaago ccggtgggga 780 

gagctaaggg ggatggaact gcagggccaa catcctctgg aagggacatg ggagaatatt 840 

aggagcagtg gagctgggga aggctgggaa gggacttggg gaggaggacc ttggtgggga 900 

cagtgctcgg gagggctggc tgggatggga gtggaggcat cacattcagg agaaagggca 960 

agggoccctg tgagatcaga gagtgggggt gcagggcaga gaggaactga acagoctggc 1020 

aggacatgga gggaggggaa agaccagaga gtcggggagg acccgggaag gagcggcgao 1080 

ccggccacgg cgagtctcac tcagcatcct tccatcocca gtgtgccacc tacaagctgt 1140 

gccaccccga ggagctggtg ctgctcggac actctctggg catcccctgg gctcccctga 1200 

gcagctgccc cagccaggcc ctgcagctgg tgagtgtcag gaaaggataa ggotaatgag 1260 

gagggggaag gagaggagga acacccatgg gctccoccat gtctccaggt tccaagctgg 1320 

gggcctgacg tatctcaggc agcaccccct aactcttccg ctctgtctca caggcaggct 1380 

gottgagcca actccatagc ggccttttcc tctaccaggg gctcatgcag gccctggaag 1440 

ggatctcccc cgagttgggt cccaccttgg acacactgca gctggacgtc gccgactttg 1500 

ccaccaocat ctggcagcag gtgagccttg ttgggcaggg tggccaaggt cgtgctggca 1560 

ttctgggcac cacagccggg cctgtgtatg ggccotgtcc atgctgtcag cccccagcat 1620 

ttcctcattt gtaataacgc ccaotcagaa gggcocaacc actgatcaca gctttccccc 1680 

acagatggaa gaaotgggaa tggcccctgc cctgcagccc acccagggtg ccatgccggc 1740 

cttcgcctct gctttccagc gccgggcagg aggggtcctg gttgcotccc atctgcagag 1800 

cttcctggag gtgtcgtacc gogttotacg ccaccttgcc cagccctgag ocaagccctc 1860 

cccatcccat gtatttatct ctatttaata tttatgtcta tttaagcctc atatttaaag 1920 

acagggaaga gcagaaogga gccccaggcc tctgtgtcct tccctgcatt tctgagtttc 1980 

attctcctgc atgtagcagt gagaaaaagc tcctgtcctc ccatcocctg gactgggagg 2040 

tagataggta aataccaagt atttattact atgactgctc cccagccctg gctctgcaat 2100 

gggcactggg atgagccgct gtgagcccct ggtcctgagg gtccccacct gggacccttg 2160 

agagtatcag gtctcccacg tgggagacaa gaaatccctg tttaatattt aaacagcagt 2220 

gttcaccatc tgggtccttg cacccctcac tctggcctca gccgactgca oagcggcccc 2280 

tgcatoccct tggctgtgag gcooctggac aagcagaggt ggccagagct gggaggoatg 2340 

gccctggggt cccacgaatt tgctggggaa totcgttttt cttcttaaga cttttgggac 2400 

atggtttgac tcccgaacat caccgacgtg tctcctgttt ttctgggtgg cctogggaca 2460 

cctgccctgc ccccaogagg gtcaggactg tgactctttt tagggccagg caggtgcctg 2520 

gacatttgcc ttgctggatg gggaotgggg atgtgggagg gagcagacag gaggaatcat 2580 

gtcaggcctg tgtgtgaaag gaagctccac tgtcaccctc cacctcttca ccccccactc 2640 

accagtgtcc cctccactgt cacattgtaa otgaaottca ggataataaa gtgtttgoot 2700 

ccagtcacgt ccttcctcct tcttgagtcc agctggtgcc tggccagggg ctggggaggt 2760 

ggctgaaggg tgggagaggc cagagggagg tcggggagga ggtctgggga ggaggtccag 2820 

ggaggaggag gaaagttctc aagttcgtct gacattcatt ccgttagcac atatttatct 2880 

gagcacctac tctgtgcaga cgctgggcta agtgctgggg acacagcagg gaacaaggca 2940 

gacatggaat otgcactcga 2960 

<210> 14 
<211> 850 
<212> DNA 

<213> Homo sapiens 

<220> 
<221> misc_feature 
<222> (3)..(4) 
<223> n=a, c, g or t 
<220> 
<221> misc_feature 
<222> (9)..(9) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (11)..(11) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (18)..(18) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (202)..(202) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (205)..(205) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (273)..(273) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (327)..(327) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (367)..(367) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (581)..(581) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (599)..(599) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (628)..(628) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (673)..(673) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (675)..(675) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (682)..(682) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (693)..(693) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (698)..(698) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (700)..(700) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (720)..(720) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (730)..(730) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (734)..(734) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (742)..(743) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (746)..(746) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (748)..(748) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (752)..(752) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (762)..(762) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (767)..(767) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (777)..(777) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (783)..(784) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (789)..(789) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (794)..(794) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (797)..(798) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (803)..(805) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (810)..(810) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (817)..(817) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (826)..(827) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (831)..(832) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (834)..(834) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (837)..(838) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (840)..(840) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (844)..(844) 
<223> n=a, c, g or t 

<220> 
<221> misc_feature 
<222> (846)..(848) 
<223> n=a, c, g or t 



<400> 14 

ttnnotttnt ngccatgncc agttcaactc agcctctcag ttccacacgg acaacatgcg 60 

ggaccctctg aaccgagtcc tggccaacct gttcctgctc atctcctcca tcctggggtc 120 

tcgcaccgct ggcccccaca cccagttcgt gcagtggttc atggaggagt gtgtggactg 180 

cctggagcag ggtggccgtg gnagngtcct gcagttcatg cccttcacca ccgtgtcgga 240 

actggtgaag gfcgtcagcca tgtctagccc canggtggtt ctggccatca cggacotcag 300 

cctgcccctg ggccgccagg tggctgntaa agccattgct gcactctgag gggcttggca 360 

tggccgnagt gggggotggg gactggcgca gccccaggcg cctccaaggg aagcagtgag 420 

gaaagatgag gcatcgtgcc tcacatccgt tccacatggt gcaagagcct ctagcggctt 480 

ccagttcccc gctcctgact cctgactcca ggatgtctcc cggtttcttc ttttcaaaat 540 

tttcctctcc atcttgctgg caactgagga gagtgagoag nctggaccac aagcccagng 600 

ggtcacccct gtgttgcgcc cgcccagncc aggagtagtc ttacctcttg aggaactttc 660 

ttggatggaa agngngtttt tntgtgttgt gtntgtgnan gtgtttttcg gggttttttn 720 

gggcaatatn ttangggaat cnnccntncg cncatttttt cnttagagct ccccggngga 780 

aanntcttna tccnctnnct ttnnnctccn tcacctncct tctttnntct nntnttnncn 840 

tccncnnncc 850 

<210> 15 
<211> 2309 
<212> DNA 

<213> Homo sapiens 

<400> 15 

coccgggcgc aggaggcggg cggcccggoc ccaccggccc cccatggacg cccccagcac 60 

ggggcgctga gacccccgcg tcgctgccca gcccggtccg gcgcgccacg ccagggatct 120 

ctggacagga caagactccg aagctactcc cccagcacac agcccgggac ccacaaaccc 180 

agcttgcccc cagccctccc acctgccact ccctggccac tcccaccgcc cgcccccctt 240 

ggggcgcagg gcatggtgtg aaaggccaag tgctgaggcg ggtatcatgg gfcgctgtgoc 300 

ctagggcctg ggtggcaggg ggtgggtggc ctgtgggtgt gccggggggg ccagtgtgcc 360 

oaccccagtc tcttggcgtg otggagggca tcctggatgg aattgaagtg aatggaacag 420 

aagccaagca aggtggagtg tgggtcagac ccagaggaga acagtgccag gtcaccagat 480 

ggaaagcgaa aaagaaagaa cggcoaatgt tccctgaaaa ccagcatgtc agggtatatc 540 

cctagttacc tggacaaaga cgagcagtgt gtcgtgtgtg gggacaaggc aactggttat 600 

cactaocgct gtatcacttg tgagggctgc aagggcttct ttcgccgcac aatccagaag 660 

aacctccatc ccacctattc ctgcaaatat gacagotgct gtgtcattga caagatcaoo 720 

cgcaatcagt gccagctgtg ccgcttcaag aagtgcatcg ccgtgggcat ggccatggac 780 

ttggttctag atgactcgaa gcgggtggcc aagcgtaagc tgattgagca gaaccgggag 840 

oggcggcgga aggaggagat gatccgatca ctgcagcagc gaccagagcc cactcctgaa 900 

gagtgggatc tgatccacat tgccacagag gcccatcgca gcaccaatgc ccagggcagc 960 

cattggaaac agaggcggaa attcctgccc gatgacattg gacagtcacc cattgtctcc 1020 

atgccggacg gagacaaggt ggacctggaa gcattcagcg agtttaccaa gatcatcacc 1080 

ccggccatca cccgtgtggt ggactttgcc aaaaaactgc ccatgttcta cgagctgcct 1140 

tgcgaagacc agatcatcct cctgaagggg tgctgcatgg agatoatgtc cctgcgggcg 1200 

gotgtcogct acgaccctga gagcgacacc ctgacgctga gtggggagat ggctgtcaag 1260 

cgggagcagc tcaagaatgg cggcctgggc gtagtctccg acgccatctt tgaactgggc 1320 

aagtcactct ctgcctttaa cctggatgac acggaagtgg ctctgctgca ggctgtgctg 1380 

ctaatgtcaa cagaccgctc gggcctgctg tgtgtggaca agatcgagaa gagtcaggag 1440 

gcgtacctgc tggcgttcga gcactacgtc aaccaacgca aacacaacat tccgcacttc 1500 

tggcccaagc tgctgatgaa ggagagagaa gtgcagagtt cgattctgta caagggggca 1560 

gcggcagaag gccggccggg cgggtcactg ggcgtccacc cggaaggaca gcagcttctc 1620 

ggaatgcatg ttgttcaggg tccgcaggtc cggcagcttg agcagcagct tggtgaagcg 1680 

ggaagtctcc aagggccggt tcttcagcac cagagcccga agagcccgca gcagcgtctc 1740 

ctggagctgc tccaccgaag cggaattctc catgcccgag cggtctgtgg ggaagacgac 1800 

agcagtgagg cggactcccc gagctcctct gaggaggaac cggaggtctg cgaggacctg 1860 

goaggcaatg cagcctctcc ctgaagcccc ccagaaggcc gatggggaag gagaaggagt 1920 

gccatacctt ctcccaggcc tctgccccaa gagcaggagg tgcctgaaag ctgggagcgt 1980 

gggctcagca gggctggtca cctcccatcc cgtaagacca ccttccottc ctcagcaggc 2040 

caaacatggc cagactccct tgctttttgc tgtgtagttc cctctgcctg ggatgccctt 2100 

ccccctttct ctgcctggca acatcttact tgtcctttga ggccccaact caagtgtcac 2160 

ctccttcccc agatccccca ggcagaaata gttgtctgtg cttccttggt tcatgcttct 2220 

actgtgacac ttatctcact gttttataat tagtcgggca tgagtctgtt tcccaagcta 2280 

gactgtgtct gaatcatgtc tgtatcccg 2309 



<210> 16 

<211> 2355 

<212> DNA 

<213> Homo sapiens 



<400> 16 

ccgttgcctc aacgtccaac ccttctgcag ggctgcagtc cggccacccc aagaccttgc 60 

tgcagggtgc ttcggatcct gatcgtgagt cgcggggtcc actccccgcc cttagccagt 120 

goccaggggg caacagcggc gatcgcaacc tctagtttga gtcaaggtcc agtttgaatg 180 

accgctotca gctggtgaag acatgaccac ootggactcc aacaacaaca caggtggcgt 240 

catcacotao attggctcca gtggctcctc cccaagoogo accagccctg aatccctcta 300 

tagtgacaac tccaatggca gcttccagtc cctgacccaa ggctgtccca cctacttccc 360 

accatccccc actggctccc tcacccaaga cccggctcgc tcctttggga gcattccacc 420 

cagcctgagt gatgacggct ccccttcttc ctcatattcc tcgtcgtcat cctcctcctc 480 

cttctataat gggagccccc ctgggagtct acaagtggcc atggaggaca gcagoogagt 540 

gtoocccagc aagagoacca gcaacatcac caagctgaat ggcatggtgt tactgtgtaa 600 

agtgtgtggg gacgttgcct cgggottcca ctacggtgtg ctcgcctgcg agggotgcaa 660 

gggctttttc cgtcggagca tccagcagaa catccagtac aaaaggtgtc tgaagaatga 720 

gaattgctcc atcgtccgca tcaatcgcaa ccgctgccag caatgtcgct tcaagaagtg 780 

tctctctgtg ggcatgtctc gagaogotgt gcgttttggg cgcatcccca aacgagagaa 840 

gcagcggatg cttgotgaga tgcagagtgc catgaacctg gccaacaacc agttgagcag 900 

ccagtgcccg ctggagactt cacccaccca gcaccccacc ccaggcccca tgggcccctc 960 

gccaccccct gctccggtcc cctcacccct ggtgggcttc tcccagtttc cacaacagct 1020 

gacgcctccc agatccccaa gccctgagcc cacagtggag gatgtgatat cccaggtggc 1080 

cogggcccat cgagagatct tcacctacgc ccatgacaag ctgggcagct cacctggcaa 1140 

cttcaatgcc aaccatgcat caggtagccc tocagccacc acoccacatc gctgggaaaa 1200 

tcagggctgc ccacctgccc ccaatgacaa caacaccttg gctgcccagc gtoataacga 1260 

ggccctaaat ggtctgcgcc aggctccctc ctcctaccct cccacctggc ctcctggooc 1320 

tgcacaccac agctgccaco agtccaacag caacgggcac cgtctatgcc ccacccacgt 1380 

gtatgcagcc ccagaaggca aggcacctgc caacagtccc cggcagggca actcaaagaa 1440 

tgttctgctg gcatgtccta tgaacatgta cccgcatgga cgcagtgggc gaacggtgca 1500 

ggagatotgg gaggatttct ccatgagctt cacgcccgct gtgcgggagg tggtagagtt 1560 

tgccaaacac atcoogggct tccgtgacct ttctcagcat gaccaagtca ccctgcttaa 1620 

ggctggcacc tttgaggtgc tgatggtgcg ctttgcttcg ttgttcaacg tgaaggacca 1680 

gacagtgatg ttcctaagcc ggaccaccta cagcctgcag gagcttggtg ccatgggcat 1740 

gggagacctg ctcagtgcca tgttcgaott cagcgagaag ctcaactcoo tggcgcttac 1800 

cgaggaggag ctgggcctct tcaccgcggt ggtgcttgtc tctgcagacc gctcgggcat 1860 

ggagaattcc gcttcggtgg agcagatcca ggagacgctg ctgcgggotc ttcgggctct 1920 

ggtgctgaag aaccggccct tggagacttc ccgcttcacc aagctgctgc tcaagctgcc 1980 

ggacctgcgg accctgaaca acatgcattc cgagaagctg ctgtocttcc gggtggacgc 2040 

ccagtgaccc gcccggccgg ccttctgccg ctgcoccctt gtacagaatc gaaotctgca 2100 

cttctctctc ctttacgaga cgaaaaggaa aagcaaacca gaatcttatt tatattgtta 2160 

taaaatattc caagatgagc ctctggcccc ctgagccttc ttgtaaatac ctgcctccct 2220 

cccccatcac cgaacttccc ctcctcccct atttaaacca ctctgtctcc cccacaaccc 2280 

tcccctggcc ctctgatttg ttctgttcct gtctcaaatc caatagttca cagctaaaaa 2340 

aaaaaaaaaa aaaag 2355 

<210> 17 

<211> 4119 

<212> DNA 

<213> Homo sapiens 






<400> 17 

gaattccgtt gctgtcgcac acacacacac acacacacac acaccccaac acacacacac 60 

acaccccaac acacacacac acacacacac acacacacac acacacacac acacagcggg 120 

atggocgagc gccgcacgcg tagcacgccg ggactagcta tcoagcctcc cagcagcctc 180 

tgcgacgggc gcggtgcgta agtacctcgc oggtggtggc cgttctccgt aagatggcgg 240 

accggcggcg gcagcgcgct tcgcaagaca ccgaggacga ggaatctggt gottcgggct 300 

ccgacagcgg cggctccccg ttgcggggag gogggagctg cagcggtagc gccggaggcg 360 

gcggcagcgg ctotctgcct tcacagcgcg gaggccgaac cggggccatt catctgcggc 420 

gggtggagag cgggggcgcc aagagtgctg aggagtcgga gtgtgagagt gaagatggca 480 

ttgaaggtga tgctgttctc tcggattatg aaagtgcaga agactcggaa ggtgaagaag 540 

gtgaatacag tgaagaggaa aactccaaag tggagctgaa atcagaagct aatgatgctg 600 

ttaattcttc aacaaaagaa gagaagggag aagaaaagcc tgacaocaaa agcactgtga 660 

otggagagag gcaaagtggg gacggacagg agagcacaga gcctgtggag aacaaagtgg 720 

gtaaaaaggg ccctaagcat ttggatgatg atgaagatcg gaagaatcca gcatacatac 780 

ctcggaaagg gctcttcttt gagcatgatc ttcgagggca aactcaggag gaggaagtca 840 

gacccaaggg gcgtcagcga aagctatgga aggatgaggg tcgctgggag catgacaagt 900 

tccgggaaga tgagcaggcc ccaaagtccc gacaggagct cattgctctt tatggttatg 960 

acattcgctc agctcataat cctgatgaca tcaaacatcg aagaatccgg aaaccccgat 1020 

atgggagtcc tccacaaaga gatccaaact ggaacggtga gcggctaaac aagtctcatc 1080 

gccaccaggg tcttgggggc accctaccac daaggacatt tattaaoagg aatgctgcag 1140 

gtaccggccg tatgtctgca cccaggaatt attctcgatc tgggggcttc aaggaaggtc 1200 

gtgctggttt taggcctgtg gaagctggtg ggcagcatgg tggccggtct ggtgagactg 1260 

ttaagcatga gattagttac cggtcacggc gcatagagca gacttctgtg agggatccat 1320 

ctccagaagc agatgctcca gtgcttggca gtcctgagaa ggaagaggca gcctcagagc 1380 

caccagctgc tgctcctgat gctgcaccac caccccctga taggcccatt gagaagaaat 1440 

cctattcccg ggcaagaaga actcgaacca aagttggaga tgcagtcaag cttgcagagg 1500 

aggtgccccc tcctcctgaa ggactgattc cagcacctcc agtcccagaa accaccccaa 1560 

ctccacctac taagactggg acctgggaag ctccggtgga ttctagtaca agtggacttg 1620 

agcaagatgt ggcacaacta aatatagcag aacagaattg gagtccgggg cagccttctt 1680 

tcctgcaacc acgggaaott cgaggtatgc ccaaccatat acacatggga gcaggacotc 1740 

cacctcagtt taaccggatg gaagaaatgg gtgtccaggg tggtcgagcc aaacgctatt 1800 

catccoagcg gcaaagacct gtgccagagc cccccgccco tccagtgcat atcagtatca 1860 

tggagggaoa ttactatgat ccactgcagt tccagggacc aatctatacc catggtgaca 1920 

gccctgcccc gctgcctcca cagggcatgc ttgtgcagcc aggaatgaac cttccccacc 1980 

caggtttaca tccccaccag acaccagctc ctctgcccaa tccaggcctc tatcccccac 2040 

cagtgtccat gtctccagga cagccaccac ctcagcagtt gcttgctcct acttactttt 2100 

ctgctccagg cgtcatgaac tttggtaatc ccagttaccc ttatgctcca ggggcactgc 2160 

ctcccccacc accgcctcat ctgtatccta atacacaggo cocatcacag gtatatggag 2220 

gagtgaccta ctataacccc gcccagcagc aggtgcagcc aaagccctcc ccaccccgga 2280 

ggactcccca gccagtcacc atcaagcccc ctccacctga ggttgtaagc aggggttcca 2340 

gttaatacaa gtttctgaat attttaaatc ttaacatcat ataaaaagca gcagaggtga 2400 

gaactcagaa gagaaataca gctggctatc tactaccaga agggcttcaa agatataggg 2460 

tgtggctcct accagcaaac agctgaaaga ggaggacccc tgccttcctc tgaggacagg 2520 

ctctagagag agggagaaac aagtggacct cgtcccatct tcactcttca cttgagttgg 2580 

ctgtgttcgg gggagcagag agagccagac agccccaagc ttctgagtct agatacagaa 2640 

gcccatgtct tctgctgttc ttcacttctg ggaaattgaa gtgtcttctg ttcccaagga 2700 

agctccttcc tgtttgtttt gttttctaag atgttcattt ttaaagcctg gcttcttatc 2760 

cttaatatta ttttaatttt ttctctttgt ttctgtttct tgctctctct ccctgccttt 2820 

aaatgaaaca agtctagtct tctggttttc tagcccctct ggattccctt ttgactcttc 2880 

cgtgcatccc agataatgga gaatgtatca gccagccttc cccaccaagt ctaaaaagac 2940 

ctggcctttc acttttagtt ggcatttgtt atcctcttgt atacttgtat tcccttaact 3000 

ctaaccctgt ggaagcatgg ctgtctgcac agagggtccc attgtgoaga aaagctcaga 3060 

gtaggtgggt aggagccctt ctctttgact taggttttta ggagtctgag catccatcaa 3120 

tacctgtact atgatgggct tctgttctct gctgagggcc aataccctac tgtggggaga 3180 

gatggcacac cagatgcttt tgtgagaaag ggatggtgga gtgagagcct ttgcctttag 3240 

gggtgtgtat tcacatagtc ctcagggcto agtcttttga ggtaagtgga attagagggo 3300 

cttgcttctc ttctttccat tcttcttgct acaccccttt tccagttgct gtggaccaat 3360 

goatctcttt aaaggcaaat attatccagc aagcagtcta ccctgtcctt tgcaattgct 3420 

cttctccacg tctttcctgc tacaagtgtt ttagatgtta otaccttatt ttccccgaat 3480 

tctatttttg tccttgcaga cagaatataa aaactcctgg gcttaaggcc taaggaagcc 3540 

agtcaccttc tgggcaaggg ctcctatctt tcctccctat ccatggcact aaaccacttc 3600 

tctgctgcct ctgtggaaga gattcctatt actgcagtao atacgtctgc caggggtaac 3660 

ctggccactg tccctgtcct tctacagaac ctgagggcaa agatggtggc tgtgrtctctc 3720 

cccggtaatg tcactgtttt tattccttcc atctagoagc tggcctaatc actctgagtc 3780 

acaggtgtgg gatggagagt ggggagaggc acttaatotg taacccccaa ggaggaaata 3840 

actaagagat tcttctaggg gtagctggtg gttgtgcctt ttgtaggctg ttccattltgc 3900 

cttaaacctg aagatgtctc ctcaagcotg tgggcagcat gcccagattc ccagacctta 3960 

agaoactgtg agagttgtct ctgttggtcc aotgtgttta gttgcaagga tttttccatg 4020 

tgtggtggtg ttttttgtta ctgttttaaa gggtgcccat ttgtgatcag cattgtgact 4080 

tggagataat aaaatttaga ctataaactt gaaaaaaaa 4119 

<210> 18 
<211> 2653 
<212> DNA 

<213> Homo sapiens 



<400> 18 

gagcgcggct ggagtttgct gotgccgctg tgcagtttgt tcaggggott gtggcggtga 60

gtccgagagg ctgcgtgtga gagacgtgag aaggatcctg cactgaggag gtggaaagaa 120

gaggattgct cgaggaggcc tggggtctgt gagaoagcgg agctgggtga aggctgoggg 180

ttccggcgag gcctgagotg tgctgtcgto atgcctoaaa cccgatccca ggoaoaggct 240

acaatcagtt ttccaaaaag gaagctgtct cgggoattga aoaaagctaa aaaotccagt 300

gatgccaaac tagaaooaac aaatgtcoaa accgtaacct gttctcctcg tgtaaaagcc 360

ctgcctctca gccccaggaa acgtctgggc gatgacaaoo tatgcaacac tccccattta 420

cctccttgtt ctccaccaaa gcaaggcaag aaagagaatg gtccccctca ctcacataca 480

ottaagggac gaagattggt atttgacaat cagctgacaa ttaagtotcc tagcaaaaga 540

gaactagoca aagttcacca aaacaaaata ctttcttcag ttagaaaaag tcaagagatc 600

acaacaaatt ctgagcagag atgtccactg aagaaagaat ctgcatgtgt gagactattc 660

aagcaagaag gcacttgota ocagcaagca aagctggtcc tgaaoacagc tgtcccagat 720

cggctgcctg ccagggaaag ggagatggat gtcatcagga atttcttgag ggaacacatc 780

tgtgggaaaa aagctggaag cctttacctt tctggtgctc ctggaactgg aaaaactgcc 840

tgcttaagcc ggattctgca agacctcaag aaggaactga aaggctttaa aactatcatg 900

ctgaattgca tgtccttgag gaotgcccag gctgtattcc cagetattgc tcaggagatt 960

tgtcaggaag aggtatccag gccagotggg aaggacatga tgaggaaatt ggaaaaacat 1020

atgaotgcag agaagggccc catgattgtg ttggtattgg acgagatgga tcaactggac 1080

agcaaaggcc aggatgtatt gtacacgcta tttgaatggc catggctaag caattctcac 1140

ttggtgctga ttggtattgc taataccctg gatctcacag atagaattct acctaggctt 1200

caagctagag aaaaatgtaa gccacagotg ttgaacttcc caccttatac cagaaatcag 1260

atagtcacta ttttgcaaga tcgacttaat caggtateta gagatcaggt tctggacaat 1320

gctgcagttc aattctgtgc ccgcaaagte tctgctgttt caggagatgt tegcaaagea 1380

ctggatgttt gcaggagagc tattgaaatt gtagagtcag atgtcaaaag ccagactatt 1440

ctcaaaccac tgtctgaatg taaatcacct tctgagcctc tgattcccaa gagggttggt 1500

cttattcaca tatcccaagt catctcagaa gttgatggta acaggatgac ettgagecaa 1560

gagggagcac aagattcctt ccctcttcag cagaagatct tggtttgctc tttgatgctc 1620

ttgatcaggc agttgaaaat caaagaggtc actctgggga agttatatga agcctacagt 1680 

aaagtctgtc gcaaacagca ggtggcggct gtggaccagt cagagtgttt gtcactttca 1740 

gggctcttgg aagccagggg cattttagga ttaaagagaa acaaggaaac ccgtttgaca 1800 

aaggtgtttt tcaagattga agagaaagaa atagaacatg ctctgaaaga taaagcttta 1860 

attggaaata tcttagctac tggattgcct taaattcttc tcttacaccc cacccgaaag 1920 

tattcagctg gcatttagag agctacagtc ttcattttag tgctttacac attcgggcct 1980 
gaaaacaaat atgacctttt ttacttgaag ccaatgaatt ttaatctata gattctttaa 2040
tattagcaca gaataatatc tttgggtctt actattttta cocataaaag tgaccaggta 2100
gacccttttt aattacattc actacttcta ccacttgtgt atctctagcc aatgtgcttg 2160
caagtgtaca gatctgtgta gaggaatgtg tgtatattta cctcttcgtt tgctcaaaca 2220 
tgagtgggta tttttttgtt tgtttttttt gttgttgttg tttttgaggc gcgtctcacc 2280 
ctgttgcoca ggctggagtg caatggcgcg ttctctgctc actacagcac ccgcttccca 2340 
ggttgaagtg attctcttgc ctcagcctcc cgagtagctg ggattacagg tgcccaccac 2400 
cgcgcccagc taatttttta atttttagta gagacagggt tttaccatgt tggccaggct 2460

ggtcttgaac tcctgaccct caagtgatct gcccaccttg gcctccctaa gtgctgggat 2520 
tataggcgtg agccaccatg ctoagocatt aaggtatttt gttaagaact ttaagtttag 2580 
ggtaagaaga atgaaaatga tooagaaaaa tgcaagcaag tccaoatgga gatttggagg 2640 
acactggtta aag 2653 

<210> 19
<211> 2907
<212> DNA

<213> Homo sapiens

<400> 19

gocatctggg cccaggcccc atgccccgag gaggggtggt ctgaagccca ccagagcccc 60

ctgccagact gtctgcctcc cttctgactg tggccgcttg gcatggocag caacagcagc 120 

tcctgcccga cacctggggg cgggcacctc aatgggtacc cggtgcctcc ctacgccttc 180 

ttcttcccoc otatgctggg tggactctco ccgccaggcg ctctgaccac tctccagcac 240 

cagcttcoag ttagtggata tagcacacca tccccagcca ccattgagac ooagagcagc 300 

agttctgaag agatagtgcc cagccctccc tcgccacccc ctctaccccg catctaoaag 360

ccttgctttg tctgtcagga caagtcotca ggctaccact atggggtcag cgcctgtgag 420 

ggctgcaagg gcttcttccg ccgcagcatc cagaagaaca tggtgtacac gtgtcaccgg 480 

gacaagaact gcatcatcaa caaggtgacc cggaaccgct gcoagtactg ccgactgcag 540 

aagtgctttg aagtgggcat gtccaaggag tctgtgagaa acgaccgaaa caagaagaag 600 

aaggaggtgc ccaagcccga gtgctctgag agctacacgc tgacgccgga ggtgggggag 660

ctcattgaga aggtgcgcaa agcgcaccag gaaaccttcc ctgccctctg ccagctgggc 720
aaatacaota cgaacaacag ctcagaacaa cgtgtctctc tggacattga cctctgggac 780
aagttoagtg aactctccac caagtgoatc attaagactg tggagttcgc caagcagctg 840
cccggcttca ccaccctcac catcgccgac cagatcaccc tcctcaaggc tgcctgcctg 900
gacatcatga tcctgcggat ctgcacgcgg tacaogcccg agcaggacac oatgaccttc 960

tcggacgggc tgaccctgaa ccggacccag atgcacaacg ctggcttcgg ccccctcacc 1020

gacctggtct ttgccttogc caaccagctg ctgcccctgg agatggatga tgcggagacg 1080 

gggctgctca gcgccatctg cctcatctgc ggagaccgcc aggacctgga gcagccggac 1140 

cgggtggaca tgctgcagga gccgctgctg gaggcgctaa aggtctacgt gcggaagogg 1200 

aggcccagcc gcccccacat gttccccaag atgctaatga agattactga cctgcgaagc 1260 

atcagcgcca agggggctga gcgggtgatc acgctgaaga tggagatccc gggctccatg 1320

ccgcctctca tccaggaaat gttggagaac tcagagggcc tggacactct gagcggacag 1380 

ccggggggtg gggggcggga cgggggtggc ctggcccccc cgccaggcag ctgtagcccc 1440 

agcctcagcc ccagctccaa cagaagoagc ccggccaccc actccccgtg accgcccacg 1500 

ccacatggac acagccctcg ccctccgccc cggcttttot ctgcctttct accgaccatg 1560

tgaccccgoa ccagccctgc ccccacctgc cctcccgggc agtactgggg accttccctg 1620 

OTggacgggg agggaggagg cagcgactcc ttggacagag gcctgggccc tcagtggact 1680

gcctgctccc acagcctggg ctgacgtcag aggccgaggc caggaactga gtgaggcccc 1740

tggtcctggg tctcaggatg ggtcctgggg gcctcgtgtt catcaagaca cccctctgcc 1800 

cagotcacca ca tot tea to accagcaaac gecaggaett ggctccccca tcctcagaac 1860 

tcacaagcca ttgctcccca gctggggaac ctoaaootcc cccctgcctc ggttggtgac 1920 

agagggggtg ggacaggggc ggggggttcc ccctgtacat accctgccat accaacccca 1980

ggtattaatt ctcgctggtt ttgtttttat tttaattttt ttgttttgat ttttttaata 2040 

agaattttca ttttaagcac atttatactg aaggaatttg tgotgtgtat tggggggago 2100 

tggatccaga gctggagggg gtgggtccgg gggagggagt ggctcggaag gggcccccac 2160 

totcctttca tgtccctgtg ccccccagtt c tec tec tea gccttttcct cctcagtttt 2220 

ctctttaaaa ctgtgaagta ctaactttco aaggcctgcc ttcccctccc tcccaetgga 2280 

gaagccgcca gcccctttct ccctctgect gaccactggg tgtggacggt gtggggcagc 2340 

cotgaaagga caggctcctg gecttggcae ttgectgeae ocaecatgag gcatggagca 2400 

gggcagagca agggccccgg gacagagttt tcccagacct ggctcotegg cagagct-gcc 2460 

tcccgtcagg gcecacatca tctaggctcc ccagocccca ctgtgaaggg getggecagg 2520 

ggcccgagct gcccccaccc ccggcctcag ccaccagcac ccccataggg cccccagaca 2580

ccacacaoat gcgcgtgcgc acacacacaa acacaeacac actggacagt agatgggecg 2640 

acacacactt ggcccgagtt cctceatttc cctggcctgc cceccacccc caacctgtcc 2700 

cacccccg/tg eccectcctt accccgcagg aegggcetae aggggggtct ccectoacco 2760 

ctgeaccccc agctggggga gctggctctg ceecgaectc cttcaceagg ggttggggco 2820 

ccttcccctg gagcccgtgg gtgcacctgt tactgttggg etttccactg agatctaetg 2880 

gataaagaat aaagttctat ttattct 2907 

<210> 20 

<211> 2096 

<212> DNA 

<213> Homo sapiens 

<220>

<221> misc_feature

<222> (23)..(23)

<223> n=a, c, g or t

<220>

<221> misc_feature

<222> (27)..(27)

<223> n=a, c, g or t

<220>

<221> misc_feature

<222> (80)..(80)

<223> n=a, c, g or t

<220>

<221> misc_feature

<222> (120)..(120)

<223> n=a, c, g or t

<400> 20

agatgtttaa aaatactttg atnctcngtt tccacctctc ttaaattgtc tttccctatg 60

ttaaatatac agtcatcacn ttgotgaaaa aagttcgcaa tgagaacaat catctaaaan 120

tggctgtaac taggtcaggc gcggttgctc atgcctgtaa tcccaccact ttgggaggcc 180

gttcttaaaa atttggcata

ctgtggctcc tcgggcaaaa tctgtacggg
cagatgaaga tgatctgttt taaaatgtga
gcccaagact ggttttaaag ttacctgaag
tggggaaggt gtttttagta caagacatca
tttttataat aotgtataaa tagtgaccat
tctgtgtttt gagtctgctt cttttgtctt

atagaacttt atggttctag tacagatact ctactacact cagcctctta



atggtggaat 


240 


ctgtaatccc 


300 


ggttgcattg 


360 


atotcaaaaa 


420 


ctccactacc 


480 


gaattctggt 


540 


gcoatccact 


600 


aagcaaggtg 


660 


ctaagccatt 


720 


tagottttac 


780 


ggaagagttg 


840 


tatagtgtct 


900 


gttacctggg 


960 


tttgactcag 


1020 


ctggaagagt 


1080 


atcttaccaa 


1140 


gaatttagtt 


1200 


gttctttagc 


1260 


ctotgctttg 


1320 


cttctgaact 


1380 


catccaotaa 


1440 


aotttcagta 


1500 


ttaccatcag 


1560 


tgtgccaagt 


1620 


tcagaggccg 


1680 


agaaaattag 


1740 


tgagtctgaa 


1800 


cggggacaac 


1860 


gctattagat 


1920 


gtctacattg 


1980 


ctatgttttt 


2040 


ttgaaa 


2096 



<210> 21 

<211> 2160 

<212> DNA 

<213> Homo sapiens 
<400> 21

agccccotgc ccctcgccgc cccccgccgc ctgcctgggc cgggccgagg atgcggcgca 60 

gcgcctcggc ggccaggctt gctcccctcc ggcacgcctg otaacttccc ccgctacgtc 120 

cccgttcgcc cgccgggccg ccccgtctcc ccgcggcctc cgggtccggg tcotccagga 180 

cggccaggcc gtgccgccgt gtgccctccg ccgctcgccc gcgcgccgcg cgctccccgc 240

ctgcgcccag cgccccgcgc cogcgcccca gtcctcgggc ggtccatgct gcccctctgc 300 

otcgtggccg ccotgctgct ggccgccggg ccogggccga gcctgggcga cgaagccatc 360 

cactgcccgc cctgctccga ggagaagctg gcgcgctgcc gcccccccgt gggctgcgag 420 

gagctggtgc gagaggcggg ctgcggctgt tgcgccactt gogccctggg ctfcggggatg 480 

ccctgcgggg tgtacacccc ccgttgcggc tcgggcatgc gctgctaccc gccccgaggg 540 

gtggagaagc ccctgcacac actgatgcac gggcaaggcg tgtgcatgga gctggcggag 600 

atcgaggcca tccaggaaag cctgcagccc tctgacaagg acgagggtga ccaccccaac 660 

aacagcttca gcccctgtag cgcccatgac cgcaggtgcc tgcagaagca cttcgccaaa 720 

attcgagacc ggagcaccag tgggggcaag atgaaggtca atggggcgcc ccgggaggat 780 

gcccggoctg tgccccaggg ctcctgccag agcgagctgc aacgggcgct ggagcggctg 840
gacgcttcac agagccgcao coacgaggac ctctaattca tccccatccc caactgcgac 900

cgcaacggca acttcoaccc caagcagtgt cacccagctc tggatgggca gcgtggcaag 960 

tgctggtgtg tggaccggaa gacgggggtg aagcttccgg ggggcctgga gccaaagggg 1020 

gagctggact gccaccagct ggctgacagc tttcgagagt gaggcctgcc agcaggcaag 1080 

ggactcagcg tcccctgcta ctcctgtgct ctggaggctg cagagctgac ccagagtgga 1140 

gtctgagtct gagtcctgte totgcctgcg gcccagaagt ttccctcaaa tgcgcgtgtg 1200

cacgtgtgcg tgtgcgtgcg tgtgtgtgtg tttgtgagca tgggtgrtgcc cttggggtaa 1260 

gccagagcct ggggtgttot ctttggtgtt acacagccca agaggactga gaotggcact 1320 

tagcccaaga ggtctgagcc ctggtgtgtt tccagatcga tcctggattc actcactcac 1380

tcattccttc actcatccag ccacctaaaa acatttactg accatgtact acgtgccagc 1440 

tctagttttc agccttggga ggttttattc tgaattccta tgattttggc atgtggagac 1500

actcctataa ggagagttca agcctgtggg agtagaaaaa tctcattccc agagtcagag 1560 

gagaagagac atgtaccttg accatcgtcc ttcctotcaa gctagcccag agggtgggag 1620 

cctaaggaag cgtggggtag cagatggagt aatggtcacg aggtccagac coactcccaa 1680 

agotoagact tgccaggctc cctttctctt cttccccagg tccttccttt aggtctggtt 1740 

gttgcaccat ctgcttggtt ggctggcagc tgagagcoct gctgtgggag agcgaagggg 1800 

gtcaaaggaa gacttgaagc acagagggct agggaggtgg ggtacatttc tctgagcagt 1860 

cagggtggga agaaagaatg caagagtgga ctgaatgtgc ctaatggaga agacccacgt 1920 

gotaggggat gaggggcttc ctgggtcctg ttcccctacc ccatttgtgg tcacagccat 1980 

gaagtcaccg ggatgaacct atccttccag tggctcgotc cctgtagctc tgcotccctc 2040 

tccatatctc ottcccotac aoctccctcc ccacacctcc ctactcccct gggcatcttc 2100 

tggcttgaat ggatggaagg agaottagga acctaccagt tggccatgat gtcttttctt 2160 

<210> 22 

<211> 2215 

<212> DNA 

<213> Homo sapiens


<400> 22 

ctgcagggag ccatgattgc accactgcac tacagcctgg gcaacagagt gagaccatgt 60

ctcaagaaaa aaaaaaaaga aagaaaccac tgotctaggc taaatcccag ccagagttgg 120

agccacccag ctaaactggc ctgttttccc tcatttcctt ccccgaaggt atgcctgtgt 180

caagatgagg tcacggacga ttacatcgga gacaacacca cagtggacta cactttgttc 240

gagtctttgt gctccaagaa ggacgtgogg aactttaaag cctggttcct ccctatcatg 300

tactccatca tttgtttcgt gggcctactg ggcaatgggc tggtcgtgtt gacctatatc 360

tatttcaaga ggctcaagac catgaccgat acctacctgc tcaacctggo ggtggoagac 420 

atcctcttcc tcctgaccct tcccttctgg gcctacagcg cggccaagtc ctgggtcttc 480 

ggtgtccact tttgcaagct catctttgcc atctacaaga tgagcttctt cagtggcatg 540 

ctectacttc tttgcatcag cattgaccgc tacgtggcca tcgtccaggc tgtctcagct 600

caocgccacc gtgcccgcgt ccttctcatc agcaagctgt cctgtgtggg catctggata 660
ctagccacag tgctctccat cccagagctc ctgtacagtg acctccagag gagcagcagt 720

gagcaagcga tgcgatgctc tctcatcaca gagcatgtgg aggcctttat caccatccag 780

gtggcccaga tggtgatcgg ctttctggtc cccctgctgg ccatgagctt ctgttacctt 840 
gteateatec gcaccctgct ccaggeacgc aactttgagc gcaacaaggc catcaaggtg 900
ateategctg tggtcgtggt cttcatagtc ttccagctgc cctacaatgg ggtggtcctg 960
goocagaegg tggecaaett caacatcacc agtagcacct gtgagctcag taagcaactc 1020
aacatcgcct aegaegtcac ctacagcotg gcctgcgtcc getgetgegt caaacctttc 1080

ttgtacgcct teateggegt caagttccgc aacgatctct tcaagctctt caaggacctg 1140

ggotgectea gecaggagea gctccggcag tggtcttcct gteggcaoat ccggcgctcc 1200 

tccatgagtg tggaggcega gaccaccacc accttctccc cataggegae tcttctgcct 1260 

ggaotagagg gacctctccc agggtccctg gggtggggat agggagoaga tgcaatgact 1320 

caggacatcc ccccgccaaa agetgetoag ggaaaagcag ctctcccctc agagtgcaag 1380 

ccctgctcca gaagttagct tcaccccaat cccagctacc tcaaccaatg ccgaaaaaga 1440

cagggctgat aagctaacac cagacagaca acactgggaa acagaggota ttgtecccta 1500 

aaccaaaaac tgaaagtgaa agtccagaaa ctgtteccac ctgctggagt gaaggggeca 1560 

aggagggtga gtgcaagggg cgtgggagtg goctgaagag tcctctgaat gaaocttctg 1620 

gcctcccaca gactcaaatg ctcagaccag ctcttccgaa aaccaggcct tatctocaag 1680 

20 aocagagata gtggggagac ttcttggatt ggtgaggaaa agoggacatc agctggtcaa 1740 

acaaactctc tgaacccctc cctccafccgt tttcttoact gtoctocaag ecagegggaa 1800 

tggcagctgc cacgccgccc taaaagcaca ctcatcccct caottgdege gtcgccctcc 1860 

caggctctca acaggggaga gtgtggtgtt tcctgcaggc caggccagct gcctccgcgt 1920 

gatcaaagee acactctggg ctccagagtg gggatgacat gcactcagct cttggotcca 1980 

otgggatggg aggagaggac aagggaaatg teaggggegg ggagggtgao agtggccgcc 2040 

25 caaggccacg agcttgttct ttgttctttg tcacagggac tgaaaacctc tcctcatgtt 2100 

ctgetttega ttcgttaaga gagcaacatt ttacccacac acagataaag ttttcccttg 2160 

aggaaacaac agctttaaaa gaaaaaagaa aaaaaaagct tggtaagtoa agtag 2215 



<210> 23

<211> 958

<212> DNA

<213> Homo sapiens

<400> 23

ggggceggae gcgaggggcg gggegagege gggacaaagg gaagegaage eggagctgeg 60 

ggcgcttttt ctgcccgcgg tgtctcagat tcattcttaa ggaactgaga acttaatctt 120

ccaaaatgtc aaaaagacca tcttatgccc cacctcccac cccagotoot gcaacacaaa 180 

tgcccagcac accagggttt gtgggataca atccatacag tcatctcgcc tacaacaact 240 

acaggctggg agggaacccg agcaccaaca gccgggtcac ggcatcotct ggtatcacga 300 

ttccaaaacc cccaaagcca ccagataagc egotgatgee ctacatgagg tacagcagaa 360 

aggtctggga ccaagtaaag gcttccaacc ctgacctaaa gttgtgggag attggcaaga 420

ttattggtgg catgtggcga gatctcactg atgaagaaaa acaagaatat ttaaacgaat 480 

acgaagcaga aaagatagag tacaatgaat ctatgaaggc ctatcataat tcccccgcgt 540 

accttgetta cataaatgea aaaagtcgtg cagaagctgc tttagaggaa gaaagtcgac 600 

agagacaatc tegcatggag aaaggagaac cgtacatgag cattcagcct gctgaagatc 660 

cagatgatta tgatgatggc ttttcaatga agcatacagc caccgcccgt ttccagagaa 720 

accaccgcct catcagtgaa attcttagtg agagtgtggt gecagaegtt eggtcagrttg 780 

tcacaacagc tagaatgeag gtcctcaaac ggcaggtcca gtccttaatg gttcatcagc 840 

gaaaactaga agctgaactt cttcaaatag aggaacgaca ccaggagaag aagaggaaat 900 

tcctggaaag cacagattca tttaacaatg aacttaaaag gttgtgcggt ctgaaagt 958 



<210> 24

<211> 6483

<212> DNA

<213> Homo sapiens

<400> 24

aagcttctaa ttgcagttca accacctgtt acatatcttc aggaaaaaat cacaacctct 60
caacttcaac ttcctcttct ataaattaga aataacaata accacacctg taaccccagc 120
actttgggag gccaaggcag gcagatcaag aggtgaggag attgagacca tcctggotaa 180
catgatgaaa ccctgtctct accaaaaaga caaaaaatta gccaggtatg gtggcaaaca 240
cctgtagtcc cagetacteg ggaggctgag gcaggagaat ggcgtgaaca cgggaggtgg 300
agottgeagt gagecgagat ggcgccactg cactccagcc tgggcgacag agcaagcctc 360
cgtctaaaaa aaaaaaaaga aagaaagaaa gaaagaaaga aaagaaataa taataaccac 420
cattcctatc tcaacagatt gttctagaaa tttttaaagc acagtatcao aaacagcact 480
acataattgt aaaacatgta tgaatatata catccaaaca acagcaatgt catagectat 540
gggtagatat aatcttatac aatgtaccaa aatcccaatt taottcacta gaoaaaotgt 600
tataccaaat tctgtacaca gtatatccaa gaaaatgtgt tgtttttatt gagaaactga 660

acctagcttg ggaacacatg tgeacagtet agttcataat atttggtgca agtatcattc 720 

totaatatag atttacattt ttgcaagcaa atttttactt geaategtaa catatccaaa 780 

ttttcccttt ttactcaatc agaaottagt gtaaagtact acaagttagt tetteggatt 840 

25 teatgetaag aaaataatgc agattttctg cattattatg gtcttcacag aaaccttaac 900 

tatgatgaat ttaaaagtgc aaaataatcc aggataactt tatgatttca cattttttaa 960 

tgttaaaaat aatgecatea ttaattagaa aattctaaaa tcattacttc cactttctta 1020 

ggoaaaatat caatatactc teatttgeca aataaattaa aagatctcot acaaacacaa 1080
totcctaaat tgtggtttta tggotttaat gttttatgtg tggcaactat tgatgctagt 1140 
taaaatttta gaaactcttt ctttttgatt ccctacagtt gtctacaaga acattattgt 1200 
agcatgatcc tgecagaett tatactattt gttgotccaa ttaaaactgt ttaaaacatg 1260 
aatttgaaaa atottatttt aactataatt ttgtagctga aacttttttt tctaaacttt 1320 
gcaaacattc tatgeaaect gaattagtgc tgagaaaatt ggatcttaat ggttgctcaa 1380 
tgttottcaa caggtgaaaa gcataataaa acatgetcat ctgaactcca cccattttca 1440 
atttcaacat agcatacotc gtgtttattc ttagggcaaa ttcaaaattg tacatattag 1500 
gattggttat tactgaagat aatttatgea ateataagee aaagatgeta agttggcaaa 1560

aagaaaacaa tgtaagtaag caaaotctaa cacatgtgga cacaacctct cagtatataa 1620 
aggcttgtca ctgtccttgg tagcaggcac tccctgggct aaacagcatc accatgtctg 1680 
ttcgatacag ctcaagcaag cactaotctt cctcccgcag tggaggagga ggaggaggag 1740 

gaggatgtgg aggaggagga ggagtgtcat coctaagaat ttctagcagc aaaggotccc 1800
ttggtggagg atttagctca ggggggttca gtggtggctc ttttagccgt gggagctctg 1860
gtgggggatg ctttgggggc tcatcaggtg gctatggagg attaggaggt tttggtggag 1920
gtagctttca tggaagctat ggaagtagca gctttggtgg gagttatgga ggcagctttg 1980

gagggggcaa tttcggaggt ggcagctttg gtgggggcag ctttggtgga ggcggctttg 2040 

gtggaggcgg ctttggagga ggctttggtg gtggatttgg aggagatggt ggccttctct 2100 

ctggaaatga aaaagtaacc atgeagaate tgaatgaccg cctggcttcc tacttggaca 2160 

45 aagttcgggc tctggaagaa tcaaactatg agctggaagg caaaatcaag gagtggtatg 2220 

aaaagcatgg caactcacat cagggggagc ctegtgacta cagcaaatac tacaaaacca 2280 

tcgatgacct taaaaatcag gtaagaggta tttttaaatc cagctttaag tatcttgtcc 2340 

atgtaatcca gacagatgaa tcttaaatta agcacaatgt ggctgttcac tatgettace 2400 

catgttactt tcttccttca aaaataaccc agtctcatca aagataaaca tctgtgaaac 2460 

tatggtcatg gcaatcttca tccagcaagt gtgctacttg tcttaagagg atgggagatt 2520 



tactaagcac ttttgaggtt ttaatgagca tacaatgagt ccacagttaa aatatgetag 2580
gctatttaca aatgtagaaa ctgaaaaaaa aaatcatgat atgaatcaga acaaaatgtt 2640

attcagactg ataacaagee atattcagta ccaacatggc aagaaaaata aattttccag 2700

tatgaaaatg ggacactget tgcttctaag gaatttctga attgtaccta ttgtgtacca 2760 

gttcagagtg tatttattta ttagtattta tcatgagtta aacaaatgea ggtgtgagtc 2820 

agecaaagea tggctgaaat acatggaaat cacatagtct aaaagaggag ggoacactta 2880



caggaataca tctatataat tccagttagt tttcagaaag gaataattcg tgtacagaaa 2940 

tacaagactg gagaaattcc aagagaacaa ataattcaaa gttaagtata tgggtaagcc 3000 

tgcaatattt catatttaaa ataaaaaatt ttcccaagat tttgtaagag aacaacataa 3060 

5 aagtgcagag tgcatctatg tcactacaaa agccatatct gcatctgacc tcttctcaaa 3120 

taactgtgcc tctccotaca gattctcaac ctaacaactg ataatgccaa catcctgctt 3180 

cagatcgaca atgccaggct ggcagctgat gacttcaggc tgaagtaagt taagtgatcg 3240 

ttgtataata ctatcacaac gaatacatca gtggttttta acaatgactt gggatgccct 3300 

caataacatt tacatttttc tgaattcacc caaagttaaa tagtattgga gttatctgag 3360 

io aaattttcca tgtcagtgtt acctttttgg caatattaaa ggaagaaaat gcatattaaa 3420 

gtaactgcta aggttttttc cattaaacca ctattacttc taagagaact gtacatgaca 3480 

aatattgcca ttacatgaga tcaactatgt agttgctttt taaatagtct ctgcccagat 3540

acatctcccc tatataagtt ataaccagta ttgatatcat gcttgtttca ggtatgagaa 3600 

tgaggtagct ctgcgccaga gcgtggaggc tgaoatcaac ggcctgcgta gggtgctgga 3660 

tgagctgacc ctgaccaagg ctgacctgga gatgcaaatt gagagcctga ctgaagagct 3720 

15 ggcctatctg aagaagaacc acgaggaggt gacacaaaag ttatactttt ccoagccaaa 3780 

agagagttca ttatggtcct cgtgtagcca ataaatcttt ctgttcctca aacaggaaat 3840 

gaaagacctt cgaaatgtgt ccactggtga tgtgaatgtg gaaatgaatg ctgccccggg 3900 

tgttgatctg actcaactto tgaataacat gagaagccaa tatgaacaac ttgctgaaca 3960 

aaaccgcaaa gatgctgaag cctggttcaa tgaaaaggta aagtaatctt oottatagtg 4020 

20 aaactcatgg aggttttatc atttcagaat ttcctcaocc ttttccttgt ttttaatact 4080 

ctagagcaag gaactgacta cagaaattga taataacatt gaacagatat ccagctataa 4140
atctgagatt actgaattga gacgtaatgt acaagctctg gagatagaac tacagtccca 4200
actggccttg gtatgttaao tctcatgaaa tgacttcaac tttatcatac aaagtttcat 4260

gctcacotaa gaatatgcaa tgcaacaaaa aaatgcagag ttggaggtaa gaaagagaaa 4320 

acaaagtgaa gctcatgtta atggaggaaa agtactacta gtgttgatct aaaagtgctg 4380 

25 aaactgaaat ggtgccatta aacatacaac aaattctgtt cattttctta ttcttctata 4440 

taatgoctta ctaaataatc aaataagcgt caeca tactc aaotgaacaa ggaagtcact 4500 

aagccacaaa aaaatccgtt tcagaaaoaa tccctggaag cctccttggc agaaacagaa 4560 

ggtegctact gtgtgcagct ctcacagatt cacgcccaga tatccgctct ggaagaacag 4620 

ttgeaacaga ttcgagotga aaccgagtgc cagaatactg aataccaaoa actcotggat 4680 

30 attaagatcc gactggagaa tgaaattoaa acctaccgca gectgetaga aggagaggga 4740 

aggtaaatta taacatgaaa agttatccca gtttctttta ttcaatattc cagatagcaa 4800 

ggattatcta aaccccaaga agatgecaga gaatgagagg aagggaggag agagggtaga 4860 

gtacagaaaa aggagtaege aaccgcaatc tcactttctc atgaatttgg cccaaaatga 4920 

ttcttaagag ttctgtgaac ttaacattgt tttcaaagga tgggttttaa aatatatacc 4980 

tggcagggtt ttattttttc aacacgtttt gottattttc taaattaacg gcaactggaa 5040 

35 agctacccac cgttttccaa cgttagagat aaccgaatgt gacctcaccc cgtttagttc 5100 

eggaggegge ggacgcggcg gcggaagttt eggeggeggo tacggcggcg gaagctcegg 5160 

eggeggaage tccggcggcg getaeggegg cggccaaggc ggcagttccg geggeggcta 5220 

eggaggegga agctccggcg geggaagetc cggcggcggc tacgggggcg gaagctccag 5280 

cggcggccac ggcggcggaa gctccagcgg cggccacggc ggcagttcca geggeggcta 5340 

cggtggtggc agttccggcg geggeggegg eggctaeggg ggeggcaget ccggcggcgg 5400

eagcagetec ggeggeggat aeggeggegg cagotccagc ggaggecaca agtcctcctc 5460 

ttccgggtcc gtgggcgagt cttcatctaa gggaccaagg tcagcagaaa ctagctgggg 5520 

taatctagaa ttagttttaa cttcctgtga tggttttttt gcgctttaag ctctagagtt 5580 

gttttaaaaa attaaaaatc ttagagaegg ttccgtttgc atttgttcac aaactactct 5640 

taacaccagc cgtgaaaaat ggcatgatca aaatgtcata ccttaagcat ttttttgggc 5700 

45 ttaacaatgt aaagttgaaa tttccttctt tttacaatat ttgcttgtta attactaagg 5760 

atccctacag actgtttaaa attttttttc catcattcac acagatacta acaaaaccag 5820 

agtaatcaag acaattattg aagaggtggc gcccgacggt agagttcttt catctatggt 5880 

tgaatcagaa accaagaaac actactatta aactgeatea agaggaaaga gtctccctto 5940 

acacagacca ttatttacag atgcatggaa aacaaagtct ccaagaaaaa acttctgtct 6000 

tgatggtcta tggaaataga ccttgaaaat aaggtgtcta caaggtgttt tgtggtttct 6060

gtatttcttc ttttcacttt accacaaagt gttctttaat ggaaagaaaa acaactttgt 6120 

gttctcattt actaatgaat ttcaataaac tttcttactg atgeaaacta tcccaatttg 6180 
tcagaattta tctttactta agtacataat actctttaaa attaaagatt agtaacccat 6240
agcagttgaa ggttgatgta tccagaaatt eggaagacag aactattgtc atgectttte 6300
taagtttttt aatcatgtat gttcagacca ccgtcagtaa attcactgag taaagtctgt 6360

aaatccccaa tattactctt taagatacac aatatgtgga aggctcccag ctctctggct 6420

ttaaattatt tcaatcctgg aaattctgga atatctcaaa tataaccccc aaaataataa 6480
taa 6483

<210> 25

<211> 1871

<212> DNA

<213> Homo sapiens

<400> 25

agttgtggcc accttoccca ggocatggat ctctccaaca acaccatgfcc actctcagtg 60 

cgoaccocog gactgtcccg gcggctctcc tcgcagagtg tgataggcag acccaggggc 120 

atgtctgctt ccagtgttgg aagtggttat gggggaagtg octttggctt tggagocagc 180 

tgtgggggag gcttttctgc tgottccatg tttggttota gttcoggott tgggggtggc 240 

tcoggaagtt ccatggcagg aggactgggt gctggttatg ggagagccct gggtggaggt 300 

agctttggag ggctggggat gggatttggg ggcagcccag gaggtggctc tctaggtatt 360 

ctctcgggca atgatggagg ccttctttct ggatcagaaa aagaaaotat gcaaaatctt 420 

aatgatagat tagcttccta cctggataag gtgcgagctc tagaagaggc taatactgag 480 

otagaaaata aaattogaga atggtatgaa acacgaggaa ctgggaotgc agatgattca 540 

cagagcgatt acagcaaata ttatccactg attgaagacc tcaggaataa gatcatttca 600 

gccagcattg gaaatgccca gctcctcttg cagattgaca atgcgagact agctgctgag 660 

gacttcagga tgaagtatga gaatgaactg gccctgogcc agggcgtaga ggccgaoatc 720 

aatggcctgc gccgggtgct ggacgagotg accctgacca ggaccgacct ggagabgoag 780

atcgagagcc tgaacgagga gctggcctao atgaagaaga adcacgagga tgagctccaa 840 
agottccggg tgggcggccc aggcgaggtc agcgtagaaa tggacgctgc caccggagtg 900
gacctcacca ggctcctcaa tgatatgcgg gcgcagtatg aaaccatogo tgagcagaat 960
cggaaggacg ctgaagcotg gttcattgaa aagagcgggg agotccgtaa ggagattagc 1020
accaacacog agcagcttca gtccagcaag agcgaggtca ccgacctgcg tcgcgccttt 1080

cagaacctgg agatcgagct acagtcccag ctcgccatga agaaatccct ggaggaotoc 1140 

ttggccgaag ccgagggcga ttactgcgcg cagctgtccc aggtgcagca gctcatcagc 1200 

aacctggagg cacagctgct ccaggtgcgc gcggacgoag agcgccagaa cgtggaccac 1260 

oagcggctgc tgaatgtcaa ggcccgoctg gagctggaga ttgagaccta cogccgcctg 1320 

ctggacgggg aggccoaagg tgatggtttg gaggaaagtt tatttgtgac agaotccaaa 1380 

tcacaagcac agtcaactga ttcotctaaa gacccaacca aaaccogaaa aatcaagaca 1440 

gttgtgcagg agatggtgaa tggtgaggtg gtctcatctc aagttoagga aattgaagaa 1500 

ctaatgtaaa atttcaoaag atctgcccoa tgattggtto cttaggaaca agaaatttac 1560 

40 aagtagaaat tattcctttc agagtaacat gctgtattac ttcaatccct atttttgtct 1620 

gttccatttt ctttggattc cctattcaca ttgaatoott tttgoccttc tgaaacaata 1680 

ttcagtcaoa agtcattttg gtcatgttgg tctttgtaac aaatcaaaat taccttatat 1740 

ccttctggac aactggagta gtcttttaac gaactttott ctggtaaccc ggaatatttt 1800 

cttaatcata gagctttact caagtagtat tgttttaata gagttaattg taataaaaga 1860 

tgaatggtaa a 1871

<210> 26

<211> 1447

<212> DNA

<213> Homo sapiens

<400> 26

ctgcaactgg ttctgcgagg gctccttcaa tggcagcgag aaggagacta tgcagttcct 
gaacgaccgc ctggccagct acctggagaa ggtgcgtcac gtggagcggg acaacgcgga 
gotggagaac ctcatccggg agoggtotca gcagcaggag cccttgctgt gccccagcta 
ccagtcotac ttcaagacca ttgaggagct ccagcagaag atootgtgca gcaagtctga 
gaatgccagg ctggtggtgc agatogacaa tgccaagctg gctgcagatg acttcagaac 
caagtaccag acggagcagt ccctgcggca gctggtggag tccgacatca acagcctgcg 
caggattctg gatgagctga ccctgtgcag gtctgacctg gaggcccaga tggagtccct 
gaaggaggag ctgctgtccc tcaagcagaa ccatgagcag gaagtcaaca ccttgcgctg 
ccagcttgga gaccgcctca acgtggaggt ggacgctgct cccgctgtgg acctgaacca 
ggtcctgaac gagaccagga atcagtatga ggccctggtg gaaaccaacc gcagggaagt 
ggagcaatgg ttcgccacgc agaccgagga gctgaacaag caggtggtat ccagctcgga 
gcagctgcag tcctaccagg cggagatcat ogagctgaga cgcacagtca atgccctgga 
gatcgagctg caggcccagc acaacctgcg atactctctg gaaaacacgc tgacagagag 
cgaggcoogc tacagctccc agctgtccca ggtgcagago ctgatcacca aogtggagtc 
ooagctggcg gagatccgca gtgacctgga gcggcagaac caggagtatc aggtgctgct 
ggacgtgcgg gcgcggctgg agtgtgagat caacacatac cggagcctgc tggagagcga 
ggactgcaag ctgccctcca acccctgcgc caccaccaat gcatgtgaaa agoccafetgg 
atcctgtgtc accaatcctt gtggtcctcg ttcccgctgt gggccttgca acacctttgg 
gtactagata ccctggggcc agcagaagta tagcatgaag acagaactac catcggtggg 
ccagttctgc ctctctgaca accatcagco accggacccc accccgaggc atcaccacaa 
atoatggtct ggaaggagaa caaatgccca gcgtttgggt ctgactctga gcctagggct 
actgatcctc ctcaccccag gtccctctcc tgtagtcagt ctgagttctg atggtcagag 
gttggagctg tgacagtggc atacgaggtg ttttgttctc tctgctgctt ctacctttat 
tgcagttccc caaatcgcct aataaacttt cctcttgcaa agcagacaaa aaaaaaaaaa 
aaaaaaa 

<210> 27 

<211> 261 

<212> PRT 

<213> Homo sapiens 



<400> 27

Met Asn Pro Asn Cys Ala Arg Cys Gly Lys Ile Val Tyr Pro Thr Glu

1 5 10 15

Lys Val Asn Cys Leu Asp Lys Phe Trp His Lys Ala Cys Phe His Cys

20 25 30

Glu Thr Cys Lys Met Thr Leu Asn Met Lys Asn Tyr Lys Gly Tyr Glu

35 40 45

Lys Lys Pro Tyr Cys Asn Ala His Tyr Pro Lys Gln Ser Phe Thr Met

50 55 60

Val Ala Asp Thr Pro Glu Asn Leu Arg Leu Lys Gln Gln Ser Glu Leu
65 70 75 80

Gln Ser Gln Val Arg Tyr Lys Glu Glu Phe Glu Lys Asn Lys Gly Lys

85 90 95

Gly Phe Ser Val Val Ala Asp Thr Pro Glu Leu Gln Arg Ile Lys Lys

100 105 110

Thr Gln Asp Gln Ile Ser Asn Ile Lys Tyr His Glu Glu Phe Glu Lys

115 120 125

Ser Arg Met Gly Pro Ser Gly Gly Glu Gly Met Glu Pro Glu Arg Arg

130 135 140

Asp Ser Gln Asp Gly Ser Ser Tyr Arg Arg Pro Leu Glu Gln Gln Gln
145 150 155 160

Pro His His Ile Pro Thr Ser Ala Pro Val Tyr Gln Gln Pro Gln Gln

165 170 175

Gln Pro Val Ala Gln Ser Tyr Gly Gly Tyr Lys Glu Pro Ala Ala Pro

180 185 190

Val Ser Ile Gln Arg Ser Ala Pro Gly Gly Gly Gly Lys Arg Tyr Arg

195 200 205

Ala Val Tyr Asp Tyr Ser Ala Ala Asp Glu Asp Glu Val Ser Phe Gln

210 215 220

Asp Gly Asp Thr Ile Val Asn Val Gln Gln Ile Asp Asp Gly Trp Met
225 230 235 240

Tyr Gly Thr Val Glu Arg Thr Gly Asp Thr Gly Met Leu Pro Ala Asn

245 250 255

Tyr Val Glu Ala Ile

260

<210> 28

<211> 478

<212> PRT

<213> Homo sapiens

<400> 28

Met Val Gln Lys Thr Ser Met Ser Arg Gly Pro Tyr Pro Pro Ser Gln

1 5 10 15

Glu Ile Pro Met Glu Val Phe Asp Pro Ser Pro Gln Gly Lys Tyr Ser

20 25 30

Lys Arg Lys Gly Arg Phe Lys Arg Ser Asp Gly Ser Thr Ser Ser Asp

35 40 45

Thr Thr Ser Asn Ser Phe Val Arg Gln Gly Ser Ala Glu Ser Tyr Thr

50 55 60

Ser Arg Pro Ser Asp Ser Asp Val Ser Leu Glu Glu Asp Arg Glu Ala
65 70 75 80

Leu Arg Lys Glu Ala Glu Arg Gln Ala Leu Ala Gln Leu Glu Lys Ala

85 90 95

Lys Thr Lys Pro Val Ala Phe Ala Val Arg Thr Asn Val Gly Tyr Asn

100 105 110

Pro Ser Pro Gly Asp Glu Val Pro Val Gln Gly Val Ala Ile Thr Phe

115 120 125

Glu Pro Lys Asp Phe Leu His Ile Lys Glu Lys Tyr Asn Asn Asp Trp

130 135 140

Trp Ile Gly Arg Leu Val Lys Glu Gly Cys Glu Val Gly Phe Ile Pro
145 150 155 160

Ser Pro Val Lys Leu Asp Ser Leu Arg Leu Leu Gln Glu Gln Lys Leu

165 170 175

Arg Gln Asn Arg Leu Gly Ser Ser Lys Ser Gly Asp Asn Ser Ser Ser

180 185 190

Ser Leu Gly Asp Val Val Thr Gly Thr Arg Arg Pro Thr Pro Pro Ala

195 200 205

Ser Ala Lys Gln Lys Gln Lys Ser Thr Glu His Val Pro Pro Tyr Asp

210 215 220

Val Val Pro Ser Met Arg Pro Ile Ile Leu Val Gly Pro Ser Leu Lys
225 230 235 240

Gly Tyr Glu Val Thr Asp Met Met Gln Lys Ala Leu Phe Asp Phe Leu
245 250 255

Lys His Arg Phe Asp Gly Arg Ile Ser Ile Thr Arg Val Thr Ala Asp

260 265 270

Ile Ser Leu Ala Lys Arg Ser Val Leu Asn Asn Pro Ser Lys His Ile

275 280 285

Ile Ile Glu Arg Ser Asn Thr Arg Ser Ser Leu Ala Glu Val Gln Ser

290 295 300

Glu Ile Glu Arg Ile Phe Glu Leu Ala Arg Thr Leu Gln Leu Val Ala
305 310 315 320

Leu Asp Ala Asp Thr Ile Asn His Pro Ala Gln Leu Ser Lys Thr Ser

325 330 335

Leu Ala Pro Ile Ile Val Tyr Ile Lys Ile Thr Ser Pro Lys Val Leu

340 345 350

Gln Arg Leu Ile Lys Ser Arg Gly Lys Ser Gln Ser Lys His Leu Asn

355 360 365

Val Gln Ile Ala Ala Ser Glu Lys Leu Ala Gln Cys Pro Pro Glu Met

370 375 380

Phe Asp Ile Ile Leu Asp Glu Asn Gln Leu Glu Asp Ala Cys Glu His
385 390 395 400

Leu Ala Glu Tyr Leu Glu Ala Tyr Trp Lys Ala Thr His Pro Pro Ser

405 410 415

Ser Thr Pro Pro Asn Pro Leu Leu Asn Arg Thr Met Ala Thr Ala Ala

420 425 430

Leu Arg Arg Ser Pro Ala Pro Val Ser Asn Leu Gln Val Gln Val Leu

435 440 445

Thr Ser Leu Arg Arg Asn Leu Gly Phe Trp Gly Gly Leu Glu Ser Ser

450 455 460

Gln Arg Gly Ser Val Val Pro Gln Glu Gln Glu His Ala Met
465 470 475

<210> 29 

<211> 196 

<212> PRT 

<213> Homo sapiens 



<400> 29 




Met Ser Met Leu Arg Leu Gln Lys Arg Leu Ala Ser Ser Val Leu Arg
1 5 10 15

Cys Gly Lys Lys Lys Val Trp Leu Asp Pro Asn Glu Thr Asn Glu Ile

20 25 30

Ala Asn Ala Asn Ser Arg Gln Gln Ile Arg Lys Leu Ile Lys Asp Gly

35 40 45

Leu Ile Ile Arg Lys Pro Val Thr Val His Ser Arg Ala Arg Cys Arg

50 55 60

Lys Asn Thr Leu Ala Arg Arg Lys Gly Arg His Met Gly Ile Gly Lys
65 70 75 80

Arg Lys Gly Thr Ala Asn Ala Arg Met Pro Glu Lys Val Thr Trp Met

85 90 95

Arg Arg Met Arg Ile Leu Arg Arg Leu Leu Arg Arg Tyr Arg Glu Ser

100 105 110

Lys Lys Ile Asp Arg His Met Tyr His Ser Leu Tyr Leu Lys Val Lys

115 120 125

Gly Asn Val Phe Lys Asn Lys Arg Ile Leu Met Glu His Ile His Lys

130 135 140

Leu Lys Ala Asp Lys Ala Arg Lys Lys Leu Leu Ala Asp Gln Ala Glu
145 150 155 160

Ala Arg Arg Ser Lys Thr Lys Glu Ala Arg Lys Arg Arg Glu Glu Arg

165 170 175

Leu Gln Ala Lys Lys Glu Glu Ile Ile Lys Thr Leu Ser Lys Glu Glu

180 185 190 

Glu Thr Lys Lys

195

<210> 30

<211> 1566

<212> PRT

<213> Homo sapiens



<400> 30 

Met Ser Ser Leu Leu Glu Arg Leu His Ala Lys Phe Asn Gln Asn Arg

1 5 10 15

Pro Trp Ser Glu Thr Ile Lys Leu Val Arg Gln Val Met Glu Lys Arg

20 25 30

Val Val Met Ser Ser Gly Gly His Gln His Leu Val Ser Cys Leu Glu

35 40 45

Thr Leu Gln Lys Ala Leu Lys Val Thr Ser Leu Pro Ala Met Thr Asp

50 55 60

Arg Leu Glu Ser Ile Ala Gly Gln Asn Gly Leu Gly Ser His Leu Ser
65 70 75 80

Ala Ser Gly Thr Glu Cys Tyr Ile Thr Ser Asp Met Phe Tyr Val Glu

85 90 95

Val Gln Leu Asp Pro Ala Gly Gln Leu Cys Asp Val Lys Val Ala His

100 105 110

His Gly Glu Asn Pro Val Ser Cys Pro Glu Leu Val Gln Gln Leu Arg

115 120 125

Glu Lys Asn Ser Asp Glu Phe Ser Lys His Leu Lys Gly Leu Val Asn

130 135 140

Leu Tyr Asn Leu Pro Gly Asp Asn Lys Leu Lys Thr Lys Met Tyr Leu
145 150 155 160

Ala Leu Gln Ser Leu Glu Gln Asp Leu Ser Lys Met Ala Ile Met Tyr

165 170 175

Trp Lys Ala Thr Asn Ala Gly Pro Leu Asp Lys Ile Leu His Gly Ser

180 185 190

Val Gly Tyr Leu Thr Pro Arg Ser Gly Gly His Leu Met Asn Leu Lys

195 200 205

Tyr Tyr Val Ser Pro Ser Asp Leu Leu Asp Asp Lys Thr Ala Ser Pro

210 215 220

Ile Ile Leu His Glu Asn Asn Val Ser Arg Ser Leu Gly Met Asn Ala
225 230 235 240

Ser Val Thr Ile Glu Gly Thr Ser Ala Val Tyr Lys Leu Pro Ile Ala

245 250 255

Pro Leu Ile Met Gly Ser His Pro Val Asp Asn Lys Trp Thr Pro Ser

260 265 270

Phe Ser Ser Ile Thr Ser Ala Asn Ser Val Asp Leu Pro Ala Cys Phe

275 280 285

Phe Leu Lys Phe Pro Gln Pro Ile Pro Val Ser Arg Ala Phe Val Gln
290 295 300

Lys Leu Gln Asn Cys Thr Gly Ile Pro Leu Phe Glu Thr Gln Pro Thr
305 310 315 320

Tyr Ala Pro Leu Tyr Glu Leu Ile Thr Gln Phe Glu Leu Ser Lys Asp

325 330 335

Pro Asp Pro Ile Pro Leu Asn His Asn Met Arg Phe Tyr Ala Ala Leu

340 345 350

Pro Gly Gln Gln His Cys Tyr Phe Leu Asn Lys Asp Ala Pro Leu Pro

355 360 365

Asp Gly Arg Ser Leu Gln Gly Thr Leu Val Ser Lys Ile Thr Phe Gln

370 375 380

His Pro Gly Arg Val Pro Leu Ile Leu Asn Leu Ile Arg His Gln Val
385 390 395 400

Ala Tyr Asn Thr Leu Ile Gly Ser Cys Val Lys Arg Thr Ile Leu Lys

405 410 415

Glu Asp Ser Pro Gly Leu Leu Gln Phe Glu Val Cys Pro Leu Ser Glu

420 425 430

Ser Arg Phe Ser Val Ser Phe Gln His Pro Val Asn Asp Ser Leu Val

435 440 445

Cys Val Val Met Asp Val Gln Gly Leu Thr His Val Ser Cys Lys Leu

450 455 460

Tyr Lys Gly Leu Ser Asp Ala Leu Ile Cys Thr Asp Asp Phe Ile Ala
465 470 475 480

Lys Val Val Gln Arg Cys Met Ser Ile Pro Val Thr Met Arg Ala Ile

485 490 495

Arg Arg Lys Ala Glu Thr Ile Gln Ala Asp Thr Pro Ala Leu Ser Leu

500 505 510

Ile Ala Glu Thr Val Glu Asp Met Val Lys Lys Asn Leu Pro Pro Ala

515 520 525

Ser Ser Pro Gly Tyr Gly Met Thr Thr Gly Asn Asn Pro Met Ser Gly

530 535 540

Thr Thr Thr Ser Thr Asn Thr Phe Pro Gly Gly Pro Ile Ala Thr Leu
545 550 555 560

Phe Asn Met Ser Met Ser Ile Lys Asp Arg His Glu Ser Val Gly His

565 570 575

Gly Glu Asp Phe Ser Lys Val Ser Gln Asn Pro Ile Leu Thr Ser Leu

580 585 590

Leu Gln Ile Thr Gly Asn Gly Gly Ser Thr Ile Gly Ser Ser Pro Thr

595 600 605

Pro Pro His His Thr Pro Pro Pro Val Ser Ser Met Ala Gly Asn Thr

610 615 620

Lys Asn His Pro Met Leu Met Asn Leu Leu Lys Asp Asn Pro Ala Gln
625 630 635 640

Asp Phe Ser Thr Leu Tyr Gly Ser Ser Pro Leu Glu Arg Gln Asn Ser

645 650 655

Ser Ser Gly Ser Pro Arg Met Glu Ile Cys Ser Gly Ser Asn Lys Thr

660 665 670

Lys Lys Lys Lys Ser Ser Arg Leu Pro Pro Glu Lys Pro Lys His Gln

675 680 685

Thr Glu Asp Asp Phe Gln Arg Glu Leu Phe Ser Met Asp Val Asp Ser

690 695 700

Gln Asn Pro Ile Phe Asp Val Asn Met Thr Ala Asp Thr Leu Asp Thr
705 710 715 720

Pro His Ile Thr Pro Ala Pro Ser Gln Cys Ser Thr Pro Pro Thr Thr

725 730 735

Tyr Pro Gln Pro Val Pro His Pro Gln Pro Ser Ile Gln Arg Met Val

740 745 750

Arg Leu Ser Ser Ser Asp Ser Ile Gly Pro Asp Val Thr Asp Ile Leu
755 760 765

Ser Asp Ile Ala Glu Glu Ala Ser Lys Leu Pro Ser Thr Ser Asp Asp

770 775 780

Cys Pro Ala Ile Gly Thr Pro Leu Arg Asp Ser Ser Ser Ser Gly His
785 790 795 800

Ser Gln Ser Thr Leu Phe Asp Ser Asp Val Phe Gln Thr Asn Asn Asn

805 810 815

Glu Asn Pro Tyr Thr Asp Pro Ala Asp Leu Ile Ala Asp Ala Ala Gly

820 825 830

Ser Pro Ser Ser Asp Ser Pro Thr Asn His Phe Phe His Asp Gly Val

835 840 845

Asp Phe Asn Pro Asp Leu Leu Asn Ser Gln Ser Gln Ser Gly Phe Gly

850 855 860

Glu Glu Tyr Phe Asp Glu Ser Ser Gln Ser Gly Asp Asn Asp Asp Phe
865 870 875 880

Lys Gly Phe Ala Ser Gln Ala Leu Asn Thr Leu Gly Val Pro Met Leu

885 890 895

Gly Gly Asp Asn Gly Glu Thr Lys Phe Lys Gly Asn Asn Gln Ala Asp

900 905 910

Thr Val Asp Phe Ser Ile Ile Ser Val Ala Gly Lys Ala Leu Ala Pro

915 920 925

Ala Asp Leu Met Glu His His Ser Gly Ser Gln Gly Pro Leu Leu Thr

930 935 940

Thr Gly Asp Leu Gly Lys Glu Lys Thr Gln Lys Arg Val Lys Glu Gly
945 950 955 960

Asn Gly Thr Ser Asn Ser Thr Leu Ser Gly Pro Gly Leu Asp Ser Lys

965 970 975

Pro Gly Lys Arg Ser Arg Thr Pro Ser Asn Asp Gly Lys Ser Lys Asp

980 985 990

Lys Pro Pro Lys Arg Lys Lys Ala Asp Thr Glu Gly Lys Ser Pro Ser

995 1000 1005

His Ser Ser Ser Asn Arg Pro Phe Thr Pro Pro Thr Ser Thr Gly

1010 1015 1020

Gly Ser Lys Ser Pro Gly Ser Ala Gly Arg Ser Gln Thr Pro Pro

1025 1030 1035

Gly Val Ala Thr Pro Pro Ile Pro Lys Ile Thr Ile Gln Ile Pro

1040 1045 1050

Lys Gly Thr Val Met Val Gly Lys Pro Ser Ser His Ser Gln Tyr

1055 1060 1065

Thr Ser Ser Gly Ser Val Ser Ser Ser Gly Ser Lys Ser His His

1070 1075 1080

Ser His Ser Ser Ser Ser Ser Ser Ser Ala Ser Thr Ser Gly Lys

1085 1090 1095

Met Lys Ser Ser Lys Ser Glu Gly Ser Ser Ser Ser Lys Leu Ser

1100 1105 1110

Ser Ser Met Tyr Ser Ser Gln Gly Ser Ser Gly Ser Ser Gln Ser

1115 1120 1125

Lys Asn Ser Ser Gln Ser Gly Gly Lys Pro Gly Ser Ser Pro Ile

1130 1135 1140

Thr Lys His Gly Leu Ser Ser Gly Ser Ser Ser Thr Lys Met Lys

1145 1150 1155

Pro Gln Gly Lys Pro Ser Ser Leu Met Asn Pro Ser Leu Ser Lys

1160 1165 1170

Pro Asn Ile Ser Pro Ser His Ser Arg Pro Pro Gly Gly Ser Asp

1175 1180 1185

Lys Leu Ala Ser Pro Met Lys Pro Val Pro Gly Thr Pro Pro Ser

1190 1195 1200

Ser Lys Ala Lys Ser Pro Ile Ser Ser Gly Ser Gly Gly Ser His

1205 1210 1215

Met Ser Gly Thr Ser Ser Ser Ser Gly Met Lys Ser Ser Ser Gly

1220 1225 1230

Leu Gly Ser Ser Gly Ser Leu Ser Gln Lys Thr Pro Pro Ser Ser

1235 1240 1245

Asn Ser Cys Thr Ala Ser Ser Ser Ser Phe Ser Ser Ser Gly Ser

1250 1255 1260

Ser Met Ser Ser Ser Gln Asn Gln His Gly Ser Ser Lys Gly Lys

1265 1270 1275

Ser Pro Ser Arg Asn Lys Lys Pro Ser Leu Thr Ala Val Ile Asp

1280 1285 1290

Lys Leu Lys His Gly Val Val Thr Ser Gly Pro Gly Gly Glu Asp

1295 1300 1305

Pro Leu Asp Gly Gln Met Gly Val Ser Thr Asn Ser Ser Ser His

1310 1315 1320

Pro Met Ser Ser Lys His Asn Met Ser Gly Gly Glu Phe Gln Gly

1325 1330 1335

Lys Arg Glu Lys Ser Asp Lys Asp Lys Ser Lys Val Ser Thr Ser

1340 1345 1350

Gly Ser Ser Val Asp Ser Ser Lys Lys Thr Ser Glu Ser Lys Asn

1355 1360 1365

Val Gly Ser Thr Gly Val Ala Lys Ile Ile Ile Ser Lys His Asp

1370 1375 1380

Gly Gly Ser Pro Ser Ile Lys Ala Lys Val Thr Leu Gln Lys Pro

1385 1390 1395

Gly Glu Ser Ser Gly Glu Gly Leu Arg Pro Gln Met Ala Ser Ser

1400 1405 1410

Lys Asn Tyr Gly Ser Pro Leu Ile Ser Gly Ser Thr Pro Lys His

1415 1420 1425

Glu Arg Gly Ser Pro Ser His Ser Lys Ser Pro Ala Tyr Thr Pro

1430 1435 1440

Gln Asn Leu Asp Ser Glu Ser Glu Ser Gly Ser Ser Ile Ala Glu

1445 1450 1455

Lys Ser Tyr Gln Asn Ser Pro Ser Ser Asp Asp Gly Ile Arg Pro

1460 1465 1470

Leu Pro Glu Tyr Ser Thr Glu Lys His Lys Lys His Lys Lys Glu

1475 1480 1485

Lys Lys Lys Val Lys Asp Lys Asp Arg Asp Arg Asp Arg Asp Lys

1490 1495 1500

Asp Arg Asp Lys Lys Lys Ser His Ser Ile Lys Pro Glu Ser Trp

1505 1510 1515

Ser Lys Ser Pro Ile Ser Ser Asp Gln Ser Leu Ser Met Thr Ser

1520 1525 1530

Asn Thr Ile Leu Ser Ala Asp Arg Pro Ser Arg Leu Ser Pro Asp

1535 1540 1545

Phe Met Ile Gly Glu Glu Asp Asp Asp Leu Met Asp Val Ala Leu

1550 1555 1560

Ile Gly Asn

1565 

<210> 31 

<211> 1490 

<212> PRT

Val Thr His Leu Asn Thr Glu Val Lys Asn Ser Ser Asp Thr Gly Lys
465 470 475 480

Val Lys Leu Asp Glu Asn Ser Glu Lys His Leu Val Lys Asp Leu Lys

485 490 495

Ala Gln Gly Thr Arg Asp Ser Lys Pro Ile Ala Leu Lys Glu Glu Ile

500 505 510

Val Thr Pro Lys Glu Thr Glu Thr Ser Glu Lys Glu Thr Pro Pro Pro

515 520 525

Leu Pro Thr Ile Ala Ser Pro Pro Pro Pro Leu Pro Thr Thr Thr Pro

530 535 540

Pro Pro Gln Thr Pro Pro Leu Pro Pro Leu Pro Pro Ile Pro Ala Leu
545 550 555 560

Pro Gln Gln Pro Pro Leu Pro Pro Ser Gln Pro Ala Phe Ser Gln Val

565 570 575

Pro Ala Ser Ser Thr Ser Thr Leu Pro Pro Ser Thr His Ser Lys Thr

580 585 590

Ser Ala Val Ser Ser Gln Ala Asn Ser Gln Pro Pro Val Gln Val Ser

595 600 605

Val Lys Thr Gln Val Ser Val Thr Ala Ala Ile Pro His Leu Lys Thr

610 615 620

Ser Thr Leu Pro Pro Leu Pro Leu Pro Pro Leu Leu Pro Gly Gly Asp
625 630 635 640

Asp Met Asp Ser Pro Lys Glu Thr Leu Pro Ser Lys Pro Val Lys Lys

645 650 655

Glu Lys Glu Gln Arg Thr Arg His Leu Leu Thr Asp Leu Pro Leu Pro

660 665 670

Pro Glu Leu Pro Gly Gly Asp Leu Ser Pro Pro Asp Ser Pro Glu Pro

675 680 685

Lys Ala Ile Thr Pro Pro Gln Gln Pro Tyr Lys Lys Arg Pro Lys Ile

690 695 700

Cys Cys Pro Arg Tyr Gly Glu Arg Arg Gln Thr Glu Ser Asp Trp Gly
705 710 715 720

Lys Arg Cys Val Asp Lys Phe Asp Ile Ile Gly Ile Ile Gly Glu Gly

725 730 735

Thr Tyr Gly Gln Val Tyr Lys Ala Arg Asp Lys Asp Thr Gly Glu Leu

740 745 750

Val Ala Leu Lys Lys Val Arg Leu Asp Asn Glu Lys Glu Gly Phe Pro

755 760 765

Ile Thr Ala Ile Arg Glu Ile Lys Ile Leu Arg Gln Leu Ile His Arg

770 775 780

Ser Val Val Asn Met Lys Glu Ile Val Thr Asp Lys Gln Asp Ala Leu
785 790 795 800

Asp Phe Lys Lys Asp Lys Gly Ala Phe Tyr Leu Val Phe Glu Tyr Met

805 810 815

Asp His Asp Leu Met Gly Leu Leu Glu Ser Gly Leu Val His Phe Ser

820 825 830

Glu Asp His Ile Lys Ser Phe Met Lys Gln Leu Met Glu Gly Leu Glu

835 840 845

Tyr Cys His Lys Lys Asn Phe Leu His Arg Asp Ile Lys Cys Ser Asn

850 855 860

Ile Leu Leu Asn Asn Ser Gly Gln Ile Lys Leu Ala Asp Phe Gly Leu
865 870 875 880

Ala Arg Leu Tyr Asn Ser Glu Glu Ser Arg Pro Tyr Thr Asn Lys Val

885 890 895

Ile Thr Leu Trp Tyr Arg Pro Pro Glu Leu Leu Leu Gly Glu Glu Arg

900 905 910

Tyr Thr Pro Ala Ile Asp Val Trp Ser Cys Gly Cys Ile Leu Gly Glu
915 920 925

Leu Phe Thr Lys Lys Pro Ile Phe Gln Ala Asn Leu Glu Leu Ala Gln

930 935 940

Leu Glu Leu Ile Ser Arg Leu Cys Gly Ser Pro Cys Pro Ala Val Trp
945 950 955 960

Pro Asp Val Ile Lys Leu Pro Tyr Phe Asn Thr Met Lys Pro Lys Lys

965 970 975

Gln Tyr Arg Arg Arg Leu Arg Glu Glu Phe Ser Phe Ile Pro Ser Ala

980 985 990

Ala Leu Asp Leu Leu Asp His Met Leu Thr Leu Asp Pro Ser Lys Arg
995 1000 1005

Gly Arg Phe His Gly Ser Gly Gly Pro Phe Ala Met His Pro Tyr Pro
225 230 235 240

Tyr Pro Cys Ser Arg Leu Ala Gly Ala Gln Cys Gln Ala Ala Gly Gly

245 250 255

Leu Gly Gly Gly Ala Ala His Ala Leu Arg Thr His Gly Tyr Cys Ala

260 265 270

Ala Tyr Glu Thr Leu Tyr Ala Ala Ala Gly Gly Gly Gly Ala Ser Pro

275 280 285

Asp Tyr Asn Ser Ser Glu Tyr Glu Gly Pro Leu Ser Pro Pro Leu Cys

290 295 300

Leu Asn Gly Asn Phe Ser Leu Lys Gln Asp Ser Ser Pro Asp His Glu
305 310 315 320

Lys Ser Tyr His Tyr Ser Met His Tyr Ser Ala Leu Pro Gly Ser Arg

325 330 335

His Gly His Gly Leu Val Phe Gly Ser Ser Ala Val Arg Gly Gly Val

340 345 350

His Ser Glu Asn Leu Leu Ser Tyr Asp Met His Leu His His Asp Arg

355 360 365

Gly Pro Met Tyr Glu Glu Leu Asn Ala Phe Phe His Asn
370 375 380

<210> 33

<400> 33

Met Ser Lys Leu Pro Arg Glu Leu Thr Arg Asp Leu Glu Arg Ser Leu

1 5 10 15

Pro Ala Val Ala Ser Leu Gly Ser Ser Leu Ser His Ser Gln Ser Leu

20 25 30

Ser Ser His Leu Leu Pro Pro Pro Glu Lys Arg Arg Ala Ile Ser Asp

35 40 45

Val Arg Arg Thr Phe Cys Leu Phe Val Thr Phe Asp Leu Leu Phe Ile

50 55 60

Ser Leu Leu Trp Ile Ile Glu Leu Asn Thr Asn Thr Gly Ile Arg Lys
65 70 75 80

Asn Leu Glu Gln Glu Ile Ile Gln Tyr Asn Phe Lys Thr Ser Phe Phe

85 90 95

Asp Ile Phe Val Leu Ala Phe Phe Arg Phe Ser Gly Leu Leu Leu Gly

100 105 110

Tyr Ala Val Leu Gln Leu Arg His Trp Trp Val Ile Ala Val Thr Thr

115 120 125

Leu Val Ser Ser Ala Phe Leu Ile Val Lys Val Ile Leu Ser Glu Leu

130 135 140

<210> 37

<211> 532 

<212> PRT 

<213> Homo sapiens 



<400> 37

<210> 38

<211> 534

<212> PRT

<213> Homo sapiens

<400> 38

Met Lys Gln Glu Gly Ser Ala Arg Arg Arg Gly Ala Asp Lys Ala Lys

1 5 10 15

Pro Pro Pro Gly Gly Gly Glu Gln Glu Pro Pro Pro Pro Pro Ala Pro

20 25 30

Gln Asp Val Glu Met Lys Glu Glu Ala Ala Thr Gly Gly Gly Ser Thr

35 40 45

Gly Glu Ala Asp Gly Lys Thr Ala Ala Ala Ala Val Glu His Ser Gln

50 55 60

Arg Glu Leu Asp Thr Val Thr Leu Glu Asp Ile Lys Glu His Val Lys
65 70 75 80

Gln Leu Glu Lys Ala Val Ser Gly Lys Glu Pro Arg Phe Val Leu Arg

85 90 95

Ala Leu Arg Met Leu Pro Ser Thr Ser Arg Arg Leu Asn His Tyr Val

100 105 110

Leu Tyr Lys Ala Val Gln Gly Phe Phe Thr Ser Asn Asn Ala Thr Arg

115 120 125

Asp Phe Leu Leu Pro Phe Leu Glu Glu Pro Met Asp Thr Glu Ala Asp

130 135 140

Leu Gln Phe Arg Pro Arg Thr Gly Lys Ala Ala Ser Thr Pro Leu Leu
145 150 155 160

Pro Glu Val Glu Ala Tyr Leu Gln Leu Leu Val Val Ile Phe Met Met

165 170 175

Asn Ser Lys Arg Tyr Lys Glu Ala Gln Lys Ile Ser Asp Asp Leu Met

180 185 190

Gln Lys Ile Ser Thr Gln Asn Arg Arg Ala Leu Asp Leu Val Ala Ala
195 200 205

Lys Cys Tyr Tyr Tyr His Ala Arg Val Tyr Glu Phe Leu Asp Lys Leu

210 215 220

Asp Val Val Arg Ser Phe Leu His Ala Arg Leu Arg Thr Ala Thr Leu
225 230 235 240

Arg His Asp Ala Asp Gly Gln Ala Thr Leu Leu Asn Leu Leu Leu Arg

245 250 255

Asn Tyr Leu His Tyr Ser Leu Tyr Asp Gln Ala Glu Lys Leu Val Ser

260 265 270

Lys Ser Val Phe Pro Glu Gln Ala Asn Asn Asn Glu Trp Ala Arg Tyr

275 280 285

Leu Tyr Tyr Thr Gly Arg Ile Lys Ala Ile Gln Leu Glu Tyr Ser Glu

290 295 300

Ala Arg Arg Thr Met Thr Asn Ala Leu Arg Lys Ala Pro Gln His Thr
305 310 315 320

Ala Val Gly Phe Lys Gln Thr Val His Lys Leu Leu Ile Val Val Glu

325 330 335

Leu Leu Leu Gly Glu Ile Pro Asp Arg Leu Gln Phe Arg Gln Pro Ser

340 345 350

Leu Lys Arg Ser Leu Met Pro Tyr Phe Leu Leu Thr Gln Ala Val Arg

355 360 365

Thr Gly Asn Leu Ala Lys Phe Asn Gln Val Leu Asp Gln Phe Gly Glu

370 375 380

Lys Phe Gln Ala Asp Gly Thr Tyr Thr Leu Ile Ile Arg Leu Arg His
385 390 395 400

Asn Val Ile Lys Thr Gly Val Arg Met Ile Ser Leu Ser Tyr Ser Arg

405 410 415

Ile Ser Leu Ala Asp Ile Ala Gln Lys Leu Gln Leu Asp Ser Pro Glu

420 425 430

Asp Ala Glu Phe Ile Val Ala Lys Ala Ile Arg Asp Gly Val Ile Glu

435 440 445

Ala Ser Ile Asn His Glu Lys Gly Tyr Val Gln Ser Lys Glu Met Ile

450 455 460

Asp Ile Tyr Ser Thr Arg Glu Pro Gln Leu Ala Phe His Gln Arg Ile
465 470 475 480

Ser Phe Cys Leu Asp Ile His Asn Met Ser Val Lys Ala Met Arg Phe

485 490 495

Pro Pro Lys Ser Tyr Asn Lys Asp Leu Glu Ser Ala Glu Glu Arg Arg

500 505 510

Glu Arg Glu Gln Gln Asp Leu Glu Phe Ala Lys Glu Met Ala Glu Asp

515 520 525 

Asp Asp Asp Ser Phe Pro 
530 

<210> 39 

<211> 207 

<212> PRT 

<213> Homo sapiens 



<400> 39 

Met Ala Gly Pro Ala Thr Gln Ser Pro Met Lys Leu Met Ala Leu Gln
1 5 10 15

Leu Leu Leu Trp His Ser Ala Leu Trp Thr Val Gln Glu Ala Thr Pro
20 25 30

Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu Lys Cys Leu

35 40 45

Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys

50 55 60

Leu Val Ser Glu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
65 70 75 80

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser

85 90 95

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His

100 105 110

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile

115 120 125

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala

130 135 140

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
145 150 155 160

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala

165 170 175

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser

180 185 190

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
195 200 205 



<210> 40 

<211> 989 

<212> PRT 

<213> Homo sapiens 



<400> 40 

Met Lys Val Val Asn Leu Lys Gln Ala Ile Leu Gln Ala Trp Lys Glu

1 5 10 15

Arg Trp Ser Tyr Tyr Gln Trp Ala Ile Asn Met Lys Lys Phe Phe Pro

20 25 30

Lys Gly Ala Thr Trp Asp Ile Leu Asn Leu Ala Asp Ala Leu Leu Glu

35 40 45

Gln Ala Met Ile Gly Pro Ser Pro Asn Pro Leu Ile Leu Ser Tyr Leu

50 55 60

Lys Tyr Ala Ile Ser Ser Gln Met Val Ser Tyr Ser Ser Val Leu Thr
65 70 75 80

Ala Ile Ser Lys Phe Asp Asp Phe Ser Arg Asp Leu Cys Val Gln Ala

85 90 95

Leu Leu Asp Ile Met Asp Met Phe Cys Asp Arg Leu Ser Cys His Gly

100 105 110

Lys Ala Glu Glu Cys Ile Gly Leu Cys Arg Ala Leu Leu Ser Ala Leu

115 120 125

His Trp Leu Leu Arg Cys Thr Ala Ala Ser Ala Glu Arg Leu Arg Glu

130 135 140

Gly Leu Glu Ala Gly Thr Pro Ala Ala Gly Glu Lys Gln Leu Ala Met
145 150 155 160

Cys Leu Gln Arg Leu Glu Lys Thr Leu Ser Ser Thr Lys Asn Arg Ala

165 170 175

Leu Leu His Ile Ala Lys Leu Glu Glu Ala Ser Ser Trp Thr Ala Ile
180 185 190

Glu His Ser Leu Leu Lys Leu Gly Glu Ile Leu Thr Asn Leu Ser Asn

195 200 205

Pro Gln Leu Arg Ser Gln Ala Glu Gln Cys Gly Thr Leu Ile Arg Ser

210 215 220

Ile Pro Thr Met Leu Ser Val His Ala Glu Gln Met His Lys Thr Gly
225 230 235 240

Phe Pro Thr Val His Ala Val Ile Leu Leu Glu Gly Thr Met Asn Leu

245 250 255

Thr Gly Glu Thr Gln Ser Leu Val Glu Gln Leu Thr Met Val Lys Arg

260 265 270

Met Gln His Ile Pro Thr Pro Leu Phe Val Leu Glu Ile Trp Lys Ala

275 280 285

Cys Phe Val Gly Leu Ile Glu Ser Pro Glu Gly Thr Glu Glu Leu Lys

290 295 300

Trp Thr Ala Phe Thr Phe Leu Lys Ile Pro Gln Val Leu Val Lys Leu
305 310 315 320

Lys Lys Tyr Ser His Gly Asp Lys Asp Phe Thr Glu Asp Val Asn Cys

325 330 335

Ala Phe Glu Phe Leu Leu Lys Leu Thr Pro Leu Leu Asp Lys Ala Asp

340 345 350

Gln Arg Cys Asn Cys Asp Cys Thr Asn Phe Leu Leu Gln Glu Cys Gly

355 360 365

Lys Gln Gly Leu Leu Ser Glu Ala Ser Val Asn Asn Leu Met Ala Lys

370 375 380

Arg Lys Ala Asp Arg Glu His Ala Pro Gln Gln Lys Ser Gly Glu Asn
385 390 395 400

Ala Asn Ile Gln Pro Asn Ile Gln Leu Ile Leu Arg Ala Glu Pro Thr

405 410 415

Val Thr Asn Ile Leu Lys Thr Met Asp Ala Asp His Ser Lys Ser Pro

420 425 430

Glu Gly Leu Leu Gly Val Leu Gly His Met Leu Ser Gly Lys Ser Leu

435 440 445

Asp Leu Leu Leu Ala Ala Ala Ala Ala Thr Gly Lys Leu Lys Ser Phe

450 455 460

Ala Arg Lys Phe Ile Asn Leu Asn Glu Phe Thr Thr Tyr Gly Ser Glu
465 470 475 480

Glu Ser Thr Lys Pro Ala Ser Val Arg Ala Leu Leu Phe Asp Ile Ser

485 490 495

Phe Leu Met Leu Cys His Val Ala Gln Thr Tyr Gly Ser Glu Val Ile

500 505 510

Leu Ser Glu Ser Arg Thr Gly Ala Glu Val Pro Phe Phe Glu Thr Trp

515 520 525

Met Gln Thr Cys Met Pro Glu Glu Gly Lys Ile Leu Asn Pro Asp His

530 535 540

Pro Cys Phe Arg Pro Asp Ser Thr Lys Val Glu Ser Leu Val Ala Leu
545 550 555 560

Leu Asn Asn Ser Ser Glu Met Lys Leu Val Gln Met Lys Trp His Glu

565 570 575

Ala Cys Leu Ser Ile Ser Ala Ala Ile Leu Glu Ile Leu Asn Ala Trp

580 585 590

Glu Asn Gly Val Leu Ala Phe Glu Ser Ile Gln Lys Ile Thr Asp Asn

595 600 605

Ile Lys Gly Lys Val Cys Ser Leu Ala Val Cys Ala Val Ala Trp Leu

610 615 620

Val Ala His Val Arg Met Leu Gly Leu Asp Glu Arg Glu Lys Ser Leu
625 630 635 640

Gln Met Ile Arg Gln Leu Ala Gly Pro Leu Phe Ser Glu Asn Thr Leu
645 650 655

Gln Phe Tyr Asn Glu Arg Val Val Ile Met Asn Ser Ile Leu Glu Arg

660 665 670

Met Cys Ala Asp Val Leu Gln Gln Thr Ala Thr Gln Ile Lys Phe Pro

675 680 685

Ser Thr Gly Val Asp Thr Met Pro Tyr Trp Asn Leu Leu Pro Pro Lys

690 695 700

Arg Pro Ile Lys Glu Val Leu Thr Asp Ile Phe Ala Lys Val Leu Glu
705 710 715 720

Lys Gly Trp Val Asp Ser Arg Ser Ile His Ile Phe Asp Thr Leu Leu

725 730 735

His Met Gly Gly Val Tyr Trp Phe Cys Asn Asn Leu Ile Lys Glu Leu

740 745 750

Leu Lys Glu Thr Arg Lys Glu His Thr Leu Arg Ala Val Glu Leu Leu

755 760 765

Tyr Ser Ile Phe Cys Leu Asp Met Gln Gln Val Thr Leu Val Leu Leu

770 775 780

Gly His Ile Leu Pro Gly Leu Leu Thr Asp Ser Ser Lys Trp His Ser
785 790 795 800

Leu Met Asp Pro Pro Gly Thr Ala Leu Ala Lys Leu Ala Val Trp Cys

805 810 815

Ala Leu Ser Ser Tyr Ser Ser His Lys Gly Gln Ala Ser Thr Arg Gln

820 825 830

Lys Lys Arg His Arg Glu Asp Ile Glu Asp Tyr Ile Ser Leu Phe Pro

835 840 845

Leu Asp Asp Val Gln Pro Ser Lys Leu Met Arg Leu Leu Ser Ser Asn

850 855 860

Glu Asp Asp Ala Asn Ile Leu Ser Ser Pro Thr Asp Arg Ser Met Ser
865 870 875 880

Ser Ser Leu Ser Ala Ser Gln Leu His Thr Val Asn Met Arg Asp Pro

885 890 895

Leu Asn Arg Val Leu Ala Asn Leu Phe Leu Leu Ile Ser Ser Ile Leu

900 905 910

Gly Ser Arg Thr Ala Gly Pro His Thr Gln Phe Val Gln Trp Phe Met

915 920 925

Glu Glu Cys Val Asp Cys Leu Glu Gln Gly Gly Arg Gly Ser Val Leu

930 935 940

Gln Phe Met Pro Phe Thr Thr Val Ser Glu Leu Val Lys Val Ser Ala
945 950 955 960

Met Ser Ser Pro Lys Val Val Leu Ala Ile Thr Asp Leu Ser Leu Pro

965 970 975

Leu Gly Arg Gln Val Ala Ala Lys Ala Ile Ala Ala Leu
980 985

<210> 41

<211> 490 

<212> PRT 

<213> Homo sapiens 



<400> 41 

Met Glu Gln Lys Pro Ser Lys Val Glu Cys Gly Ser Asp Pro Glu Glu
1 5 10 15

Asn Ser Ala Arg Ser Pro Asp Gly Lys Arg Lys Arg Lys Asn Gly Gln
20 25 30

Cys Ser Leu Lys Thr Ser Met Ser Gly Tyr Ile Pro Ser Tyr Leu Asp

35 40 45

Lys Asp Glu Gln Cys Val Val Cys Gly Asp Lys Ala Thr Gly Tyr His

50 55 60

Tyr Arg Cys Ile Thr Cys Glu Gly Cys Lys Gly Phe Phe Arg Arg Thr
65 70 75 80

Ile Gln Lys Asn Leu His Pro Thr Tyr Ser Cys Lys Tyr Asp Ser Cys

85 90 95

Cys Val Ile Asp Lys Ile Thr Arg Asn Gln Cys Gln Leu Cys Arg Phe

100 105 110

Lys Lys Cys Ile Ala Val Gly Met Ala Met Asp Leu Val Leu Asp Asp

115 120 125

Ser Lys Arg Val Ala Lys Arg Lys Leu Ile Glu Gln Asn Arg Glu Arg

130 135 140

Arg Arg Lys Glu Glu Met Ile Arg Ser Leu Gln Gln Arg Pro Glu Pro
145 150 155 160

Thr Pro Glu Glu Trp Asp Leu Ile His Ile Ala Thr Glu Ala His Arg

165 170 175

Ser Thr Asn Ala Gln Gly Ser His Trp Lys Gln Arg Arg Lys Phe Leu

180 185 190

Pro Asp Asp Ile Gly Gln Ser Pro Ile Val Ser Met Pro Asp Gly Asp

195 200 205

Lys Val Asp Leu Glu Ala Phe Ser Glu Phe Thr Lys Ile Ile Thr Pro

210 215 220

Ala Ile Thr Arg Val Val Asp Phe Ala Lys Lys Leu Pro Met Phe Ser
225 230 235 240

Glu Leu Pro Cys Glu Asp Gln Ile Ile Leu Leu Lys Gly Cys Cys Met

245 250 255

Glu Ile Met Ser Leu Arg Ala Ala Val Arg Tyr Asp Pro Glu Ser Asp

260 265 270

Thr Leu Thr Leu Ser Gly Glu Met Ala Val Lys Arg Glu Gln Leu Lys

275 280 285

Asn Gly Gly Leu Gly Val Val Ser Asp Ala Ile Phe Glu Leu Gly Lys

290 295 300

Ser Leu Ser Ala Phe Asn Leu Asp Asp Thr Glu Val Ala Leu Leu Gln
305 310 315 320

Ala Val Leu Leu Met Ser Thr Asp Arg Ser Gly Leu Leu Cys Val Asp

325 330 335

Lys Ile Glu Lys Ser Gln Glu Ala Tyr Leu Leu Ala Phe Glu His Tyr

340 345 350

Val Asn His Arg Lys His Asn Ile Pro His Phe Trp Pro Lys Leu Leu

355 360 365

Met Lys Glu Arg Glu Val Gln Ser Ser Ile Leu Tyr Lys Gly Ala Ala

370 375 380

Ala Glu Gly Arg Pro Gly Gly Ser Leu Gly Val His Pro Glu Gly Gln
385 390 395 400

Gln Leu Leu Gly Met His Val Val Gln Gly Pro Gln Val Arg Gln Leu

405 410 415

Glu Gln Gln Leu Gly Glu Ala Gly Ser Leu Gln Gly Pro Val Leu Gln

420 425 430

His Gln Ser Pro Lys Ser Pro Gln Gln Arg Leu Leu Glu Leu Leu His

435 440 445

Arg Ser Gly Ile Leu His Ala Arg Ala Val Cys Gly Glu Asp Asp Ser

450 455 460

Ser Glu Ala Asp Ser Pro Ser Ser Ser Glu Glu Glu Pro Glu Val Cys
465 470 475 480

Glu Asp Leu Ala Gly Asn Ala Ala Ser Pro
485 490

<210> 42

<211> 614 

<212> PRT 

<213> Homo sapiens 



<400> 42 

Met Thr Thr Leu Asp Ser Asn Asn Asn Thr Gly Gly Val Ile Thr Tyr

1 5 10 15

Ile Gly Ser Ser Gly Ser Ser Pro Ser Arg Thr Ser Pro Glu Ser Leu

20 25 30

Tyr Ser Asp Asn Ser Asn Gly Ser Phe Gln Ser Leu Thr Gln Gly Cys

35 40 45

Pro Thr Tyr Phe Pro Pro Ser Pro Thr Gly Ser Leu Thr Gln Asp Pro

50 55 60

Ala Arg Ser Phe Gly Ser Ile Pro Pro Ser Leu Ser Asp Asp Gly Ser
65 70 75 80

Pro Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Phe Tyr Asn

85 90 95

Gly Ser Pro Pro Gly Ser Leu Gln Val Ala Met Glu Asp Ser Ser Arg

100 105 110

Val Ser Pro Ser Lys Ser Thr Ser Asn Ile Thr Lys Leu Asn Gly Met

115 120 125

Val Leu Leu Cys Lys Val Cys Gly Asp Val Ala Ser Gly Phe His Tyr

130 135 140

Gly Val Leu Ala Cys Glu Gly Cys Lys Gly Phe Phe Arg Arg Ser Ile
145 150 155 160

Gln Gln Asn Ile Gln Tyr Lys Arg Cys Leu Lys Asn Glu Asn Cys Ser

165 170 175

Ile Val Arg Ile Asn Arg Asn Arg Cys Gln Gln Cys Arg Phe Lys Lys

180 185 190

Cys Leu Ser Val Gly Met Ser Arg Asp Ala Val Arg Phe Gly Arg Ile

195 200 205

Pro Lys Arg Glu Lys Gln Arg Met Leu Ala Glu Met Gln Ser Ala Met

210 215 220

Asn Leu Ala Asn Asn Gln Leu Ser Ser Gln Cys Pro Leu Glu Thr Ser
225 230 235 240

Pro Thr Gln His Pro Thr Pro Gly Pro Met Gly Pro Ser Pro Pro Pro

245 250 255

Ala Pro Val Pro Ser Pro Leu Val Gly Phe Ser Gln Phe Pro Gln Gln

260 265 270

Leu Thr Pro Pro Arg Ser Pro Ser Pro Glu Pro Thr Val Glu Asp Val

275 280 285

Ile Ser Gln Val Ala Arg Ala His Arg Glu Ile Phe Thr Tyr Ala His

290 295 300

Asp Lys Leu Gly Ser Ser Pro Gly Asn Phe Asn Ala Asn His Ala Ser
305 310 315 320

Gly Ser Pro Pro Ala Thr Thr Pro His Arg Trp Glu Asn Gln Gly Cys

325 330 335

Pro Pro Ala Pro Asn Asp Asn Asn Thr Leu Ala Ala Gln Arg His Asn

340 345 350

Glu Ala Leu Asn Gly Leu Arg Gln Ala Pro Ser Ser Tyr Pro Pro Thr
355 360 365

545

Leu Glu Thr Ser Arg

<210> 43

<211> 703 

<212> PRT 

<213> Homo sapiens 



<400> 43 

Met Ala Asp Arg Arg Arg Gln Arg Ala Ser Gln Asp Thr Glu Asp Glu

1 5 10 15

Glu Ser Gly Ala Ser Gly Ser Asp Ser Gly Gly Ser Pro Leu Arg Gly

20 25 30

Gly Gly Ser Cys Ser Gly Ser Ala Gly Gly Gly Gly Ser Gly Ser Leu

35 40 45

Pro Ser Gln Arg Gly Gly Arg Thr Gly Ala Leu His Leu Arg Arg Val

50 55 60

Glu Ser Gly Gly Ala Lys Ser Ala Glu Glu Ser Glu Cys Glu Ser Glu
65 70 75 80

Asp Gly Ile Glu Gly Asp Ala Val Leu Ser Asp Tyr Glu Ser Ala Glu

85 90 95

Asp Ser Glu Gly Glu Glu Gly Glu Tyr Ser Glu Glu Glu Asn Ser Lys
100 105 110
Val Glu Leu
Glu Glu Lys
130






l ^5 

X J 3 




Glu Arg Gin 


Ser Gly 


Asp 


«iy 




145 




XDv 






Lys Val Gly 


Lys Lys 


Giy 


tro 


Lys 


165 








Lys Asn Pro 


Ala Tyr 


lie 


fro 


Arg 


180 








Leu Arg Gly 


Gin Thr 


vvXn 


ulU 


kvXU 


195 










Arg Lys Leu 


Trp Lys 


Asp 


blu 


t»xy 


210 






xlD 




Glu Asp Glu 


Gin Ala 


Pro 


Lys 


oer 


225 




Tin 






Gly Tyr Asp 


lie Arg 

O A C 

245 


Ser 


Ala 


HXS 


Arg lie Arg 


Lys Pro 
260 


Arg 


Tyr 


Giy 


Trp Asn Gly 


Glu Arg 


Leu 


Asn 


T ira 

.Lys 


275 










Gly Thr Leu 


Pro Pro 


Arg 


■ 

Thr 


f no 


290 






OQC 

mtyo 




Gly Arg Mat 


Ser Ala 


pro 


Arg 


Ron 


305 




J1U 






Glu Gly Arg 


Ala Gly 


f no 


Arg 


rtu 


325 








Gly Arg Ser 


Gly Glu 
340 


Thr 


val 


T ira 

Lys 


Arg Leu Glu 


Gin Thr 


Ser 


vax 


Arg 


355 








A 

JOU 


Pro Val Leu 


Gly Ser 


Fro 


blU 


Lys 


<a *7 r\ 

370 






«J f S 




Jk V _ 11- Kin 

Ala Ala Ala 


Pro Asp 


Ala 


Al a 
ftlo 


Pro 






J 3rv 






x*ys uys oor 


Tyr Ser 
405 


li"/t 


Ala 


Arg 


Ala val iiys 


Lieu Ala 
420 




Glu 


Val 


Pro Ala Pro 


lr tO VAX 


Prn 


Glu 


Thr 


A IK 








440 


Gly Thr Trp 


VarXU AX a 




Val 


Son 








455 




Asp vai ax a 


Gin Leu 


Mali 


lie 


Ala 


4o5 




470 






Pro Ser Phe 


Leu Gin 
485 




axtcj 


Glu 


Hxs Met Gly 


X 1 — k ^ 1 mm 

Ala Giy 
500 


Fro 


fro 


Prn 


Gly Val Gin 


Gly Gly 


Arg 


Ala 


Lys 


515 








520 


Pro Val Pro 


Glu Pro 


Pro 


Ala 


Pro 


530 






535 




Gly Hi 8 Tyr 


Tyr Asp 


Pro 


Leu 


Gin 


545 




550 






Gly Asp Ser 


Pro Ala 


Pro 


Leu 


Pro 


565 









Asp 


Ala 


Val 


Asn 


Ser 


Ser 


Thr 


Lys 










125 








Ann 
ASp 


Thr 


Lys 


Ser 


Thr 


Val 


Thr 


Gly 






140 










ulU 


Ser 


Thr 


Glu 


Pro 


Val 


Glu 


Asn 






155 










160 


Hi « 
nxo 


Lou 


Ann 
Map 


Ann 
****** 


Asd 


Glu 


Asp 


Arg 




170 

X * V 








175 




Lys 


Glv 


Lou 


Phe 


Phe 


Glu 


His 


Asp 


1S5 








190 






Glu 


Val 


Arrr 


Pro 


Lvs 


Gly 

w X 


Arg 


Gin 










205 








Arg 


irp 


UlU 


His 


Asp 


Lvs 


Phe 


Arcr 








220 










Arg 


vin 


Glu 




He 


Ala 


Lou 


Tvr 

X 




235 










240 


Asn 


Pro 


Asp 


Asp 


He 


Lvs 
x ° 


Pro 


Ara 




AVV 










255 




oer 


MTI.KJ 


Prn 


Gin 


Ara 


Asp 


Pro 


Asn 


9fi5 










270 






0OJ. 


His 




His 


Gin 


Glv 


Leu 


Gly 










285 








119 


Aon 


ira 


Asn 


Ala 


Ala 


Glv 


Thr 








300 










Tyr 


Ser 


Arg 


Ser 


Glv 


Gly 

X 


Phe 


Lys 




315 










320 


Val 
V dX 


Glu 


Ala 


Glv 


Glv 

\y J. Jf 


Gin 


His 


Gly 




330 










335 




nis 


UlU 


lie 


Ser 


Tvr 


Arg 


Ser 


Arg 












350 






ASp 




Ser 


Pro 


Glu 


Ala 


Asp 


Ala 








365 








r*l ii 

i»XU 


IvXU 


a! a 

Ala 


Ala 


Ser 


Glu 


Pro 


Pro 








380 










Pro 


Pro 


Pro 


ASP 


Ara 
**** ^ 


Pro 


He 


Glu 






395 








400 




Thr 


Arg 


Thr 


Lvs 

X 


Val 


Gly 


Asp 


410 










415 




Pro 


Pro 


Pro 


Pro 


Glu 


Gly 


Leu 


He 


425 










430 






Thr 


Pro 


Thr 


Pro 


Pro 


Thr 


Lys 


Thr 










445 










Ser 


*Thr 


Ser 


Glv 


Leu . 


Glu 


Gin 








460 










Glu 


Gin 


Asn 


Trp 


Ser 


Pro 


Gly 


Gin 






475 










480 


Leu 


Ara 


Glv 


Met 


Pro 


Asn 


His 


He 




490 










495 




Gin 


Phe 


Asn 


Arg 


Met 


Glu 


Glu 


Met 


505 








510 






Arg 


Tyr 


Ser 


Ser 


Gin 


Arg 


Gin 


Arg 






525 








Pro 


Val 


His 


He 


Ser 


He 


Met 


Glu 








540 










Phe 


Gin 


Gly 


Pro 


He 


Tyr 


Thr 


His 






555 










560 


Pro 


Gin 


Gly 


Met 


Leu 


Val 


Gin 


Pro 




570 








575 
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Gly 


Mat 


Aan 


Pro His Pro Glv Leu His Pro 


His 


Gin 


Thr 


Pro 


Ala 






580 585 






590 






Fro 


LeU 




Asn Pro Gly Leu Tyr Pro Pro Pro 


Val 


Ser 


Mat 


Ser 


Pro 






595 

iJ ij 


600 




605 






Ala 


/21 vr 

*»Ay 






Pro Pro Gln Gln Leu Leu Ala Pro


Thr 


Tyr 


Phe 


Ser 


610 




615 


620 








Gly 


Pro 


Glv 

VTA. J 


Val 


Met Asn Phe Gly Asn Pro Ser Tyr 


Pro 


Tyr 


Ala 


Pro 


625 




630 635 










640 


Ala 


Leu 


Pro 


Pro Pro Pro Pro Pro His Leu Tyr 


Pro 


Asn 


Thr 


Gin 


Ala 








645 650 








655 




Pro 


Ser 


Gin 


Val Tyr Gly Gly Val Thr Tyr Tyr 


Asn 


Pro 


Ala Gln


Gin 








660 665 






670 






Gin 


Val 


Gin 


Pro Lys Pro Ser Pro Pro Arg Arg 


Thr 


Pro 


Gin 


Pro 


Val 






675 


680 




685 








Thr 


He 


Lys 


Pro Pro Pro Pro Glu Val Val Ser 


Arg 


Gly 


Ser 


Ser 






690 


695 


700 











<210> 44 
<211> 560 



<212> PRT 

<213> Homo sapiens 



<400> 44 










Phe Pro 


Lys 


Mat 


Pro 


Gin 


Thr 


Arg Ser Gln Ala Gln Ala Thr


He 


Ser 


1 








5 10 






15 




Arg 


Lys 


Leu 


Ser 


Arg Ala Leu Asn Lys Ala Lys 


Asn 


Ser 


Ser Asp 


Ala 




20 


25 






30 




Lys 


Leu 


Glu 


Pro 


Thr Asn Val Gln Thr Val Thr


Cys 


Ser 


Pro Arg 


Val 




35 




40 




45 






Lys 


Ala 


Leu 


Pro 


Leu Ser Pro Arg Lys Arg Leu 


Gly 


Asp 


Asp Asn 


Leu 


50 






55 


60 








Cys 


Asn 


Thr 


Pro 


His Leu Pro Pro Cys Ser Pro 


Pro 


Lys 


Gln Gly


Lys 


65 








70- 75 








80 


Lys 


Glu 


Asn 


Gly 


Pro Pro His Ser His Thr Leu 


Lys 


Gly 


Arg Arg 


Leu 






85 90 






95 




Val 


Phe 


Asp 


Asn 


Gln Leu Thr Ile Lys Ser Pro


Ser 


Lys 


Arg Glu 


Leu 






100 


105 






110 




Ala 


Lys 


Val 


His 


Gln Asn Lys Ile Leu Ser Ser


Val 


Arg 


Lys Ser 


Gin 




115 




120 




125 






Glu 


He 


Thr 


Thr 


Asn Ser Glu Gln Arg Cys Pro


Leu 


Lys 


Lys Glu 


Ser 




130 






135 


140 








Ala 


Cys 


Val 


Arg 


Leu Phe Lys Gln Glu Gly Thr


Cys 


Tyr 


Gln Gln


Ala 


145 




150 155 








160 


Lys 


Leu 


Val 


Leu 


Asn Thr Ala Val Pro Asp Arg 


Leu 


Pro 


Ala Arg 


Glu 








165 170 






175 




Arg 


Glu 


Met 


Asp 


Val Ile Arg Asn Phe Leu Arg


Glu 


His 


Ile Cys


Gly 






180 


185 






190 




Lys 


Lys 


Ala 


Gly 


Ser Leu Tyr Leu Ser Gly Ala 


Pro 


Gly 


Thr Gly 


Lys 


195 




200 




205 






Thr 


Ala 


Cys 


Leu 


Ser Arg Ile Leu Gln Asp Leu


Lys 


Lys 


Glu Leu 


Lys 




210 




215 


220 






Gin 


Gly 


Phe Lys 


Thr 


Ile Met Leu Asn Cys Met Ser


Leu 


Arg 


Thr Ala 


225 








230 235 








240 
Ala


Val 


Phe Pro Ala Ile


AJ.a t»in 










Arg 


Pro 


Ala Gly Lys Asp 


Met Met 




260 




Ala 


Glu 


Lys Gly Pro Met 


Ile Val






275 


280 


Lou 


Asp 


Ser Lys Gly Gln


Asp Val 




290 




295 


Trp 


Leu 


Ser Asn Ser His 


Leu Val 


305 




310 




Asp 


Leu 


Thr Asp Arg Ile


Leu Pro




325 




Lys 


Pro 


Gln Leu Leu Asn


Phe Pro 




340 




Thr 


lie 


Leu Gln Asp Arg


Leu Asn 






355 


360 


Asp 


Asn 


Ala Ala Val Gln


Phe Cys 


370 




375 


Gly 


Asp 


Val Arg Lys Ala 


Leu Asp 


365 




390 




Val 


Glu 


Ser Asp Val Lys 


Ser Gln






405 




Cys 


Lys 


Ser Pro Ser Glu 


Pro Leu 


420 




His 


lie 


Ser Gln Val Ile


Ser Glu 






435 


440 


Ser 


Gin 


Glu Gly Ala Gln


Asp Ser 




450 




455 


Val 


Cys 


Ser Leu Met Leu 


Leu Ile


465 


470 




Thr 


Leu 


Gly Lys Leu Tyr 


Glu Ala






485 




Gin 


Val 


Ala Ala Val Asp 


Gln Ser






500 




Leu 


Glu 


Ala Arg Gly Ile


Leu Gly






515 


520 


lieu 


Thr 


Lys Val Phe Phe 


Lys Ile




530 




535 


Leu 


Lys 


Asp Lys Ala Leu 


Ile Gly


545 




550 





<210> 45 
<211> 462 



<212> PRT 



Glu 


lie cys uin 




Gin Val Ser* 




<cz>V 




255 


Arg 


Lys Leu Glu 


Lys 


u 4 a Mat* Thr 


zoo 






270 


Leu 


vai Lieu Asp 


VvlU 


Maf AflD Gill 




• 






Leu 


Tyr Thr Leu 


fne 


pli, Ty-n Prn 




•ann 






Leu 


lie GXy Xx8 


&1 a 
Alcl 


Ken TV»r* TjAU 








320 


Arg 


T /*1 « Kl fa 

XjOU Gin ax a 


ATy 


/il ii T.xra Pun 


J JU 




335 


Pro 


Tyr inr Arg 


Asn 


Gin Tie Val 


345 








Gin 


Val Ser Arg 


Asp 








365 




Ala 


Arg Lys Val 


Ser 


fti. Val Cor 

Aia vai oor 




380 






Val 


Cys Arg Arg 


Ala 


1X8 vfXU 110 




395 




400 


Thr 


He Leu Lys 


Pro 


T ah Oar Gl 11 

Ajeu oexr IvXU 




410 




415 

** X J 


He 


Pro Lys Arg Val 


vsiy IraU llo 


425 








vai 


Asp Gly Asn Arg 


Mot* fVl t* Lqu 






445 




Phe 


Pro Leu Gin 


Gin 






460 






Arg 


Gin Leu Lys 


He 


T.vo Gin Val 
iiys uiu v<»x 


475 




480 


Tyr 


Ser Lys Val Cys 


1tvt T.vtt Gin 


490 




495 


Glu 


Cys Leu Ser 


Leu 


Ser Gly Leu 


505 




510 


Leu 


Lys Arg Asn Lys 


Glu Thr Arg 






525 




Glu 


Glu Lys Glu 


He 


Glu His Ala 




540 






Asn 


He Leu Ala 


Thr 


Gly Leu Pro 




555 




560 



<213> Homo sapiens 



<400> 45 

Met Ala Ser Asn Ser Ser Ser Cys Pro Thr Pro Gly Gly Gly His Leu 

1 5 10 15 

Asn Gly Tyr Pro Val Pro Pro Tyr Ala Phe Phe Phe Pro Pro Met Leu

20 25 30

Gly Gly Leu Ser Pro Pro Gly Ala Leu Thr Thr Leu Gln His Gln Leu
35 40 45 
Pro


Val Ser 


Gly Tyr Ser 


Thr 


Pro 




50 




55 




Ser 


Ser Ser 


Ser Glu Glu 


He 


Val 


65 




70 






Leu 


Pro Arg 


He Tyr Lys 


Pro 


Cys 






85 






Gly 


Tyr His 


Tyr Gly Val 


Ser 


Ala 






100 






Arg 


Arg Ser 


Ile Gln Lys Asn


Met 


115 






120 


Asn 


Cys Ile


Ile Asn Lys


Val 


Thr 




130 




135 




lieu 


Gin Lys 


Cys Phe Glu 


Val 


Gly 


145 




150 






Asp 


Arg Asn 


Lys Lys Lys 


Lys 


Glu 




165 






Ser 


Tyr Thr 


Leu Thr Pro 


Glu 


Val 




180 






Lys 


Ala His 


Gln Glu Thr


Phe 


Pro 


195 






200 


Thr 


Thr Asn 


Asn Ser Ser 


Glu 


Gin 




210 




215 




Trp 


Asp Lys 


Phe Ser Glu 


Leu 


Ser 


225 




230 






Glu 


Phe Ala 


Lys Gln Leu


Pro 


Gly 






245 






Gin 


Ile Thr


Leu Leu Lys Ala 


Ala 






260 






He 


Cys Thr 


Arg Tyr Thr 


Pro 


Glu 




275 






280 


Gly 


Leu Thr


Leu Asn Arg 


Thr 


Gin 


290 




295 




Leu 


Thr Asp 


Leu Val Phe 


Ala 


Phe 


305 




310 






Met 


Asp Asp 


Ala Glu Thr Gly 


Leu 






325 






Gly 


Asp Arg 


Gln Asp Leu Glu


Gin 






340 






Glu 


Pro Leu 


Leu Glu Ala 


Leu 


Lys 




355 






JoU 


Ser 


Arg Pro 


His Met Phe


Pro 


Lys 




370 




375 




Arg 


Ser Ile


Ser Ala Lys Gly 


Ala 


385 




390 






Glu 


Ile Pro


Gly Ser Met 


Pro 


Pro 






405 






Ser 


Glu Gly 


Leu Asp Thr Leu 


Ser 






420 






Asp 


Gly Gly 


Gly Leu Ala Pro 


Pro 


435 






440 


Ser 


Pro Ser 


Ser Asn Arg Ser 


Ser 




450 




455 





Ser Pro Ala 


ML. 

Tnr 
ou 


lie 


viu 


TH t* 


Gin 


Pro Ser Pro 


Fro 




irro 


Prn 


Pro 


75 










80 


Phe Val Cys 


bin 


Asp 


Lys 


Sat* 


Ser 


S#vJ 








OS 




Cys Glu Gly 


Cys 


Lys 


Civ 

i*iy 


Phe 


Phe 


103 






110 






val Tyr inr 


Cys 


nl o 




Aon 


Lvs 












Arg Asn Arg 


Cys 


win 


Tyr 


Cys 


Ara 












Met Ser iiys 


bill 


oer 


vox 




Asn 












160 


Val Pro Lys 


Pro 


ulU 


cys . 


Ser 


Glu 


170 








175 




Gly Glu Leu 


Tl a 

lie 


ulU 


Lys 


Val 


Arg 


loo 






* Jr V 






Ala Leu Cys 


ulli 




i»iy 


T,\»B 
J-*Jf «» 














Arg Val Ser 


Leu 


ASp 


Tl A 

lie 


Asp 


Leu 


220 










Thr Lys Cys 


lie 


lie 


Lys 


Thr 


V «*x 


235 












Phe Thr Thr 


Leu 


Thr 


Tl A 

lie 


/via 


Asp 


250 












Cys Leu Asp 


Tl A 

lie 


Leu 


Tip 


LeU 


Arn 


265 






270 






Gin Asp rnr 


ClQu 


111X 


Phe 


Ser 


Asp 












Met Hxs Asn 


&1 n 
Ala 

300 


r2i v 




Glv 


Pro 


Ala ASn Ky±Ti 


Leu 


Lou 


Pro 


lieu 


Glu 












320 


lieu ser Aia 




WJf o 


lieu 


He 


Cvs 


jJO 












KaT\ it'll 

Fro Asp Arg 




Ann 


Met 


Leu 


Gin 








350 






Vai ijf* ▼<■"*- 




Lvs 


Aro 


Aro 


Pro 




365 








nBu i-nJVJ mv 


Lys 
380 


He 


Thr 


Agn 

A*»£* 


Leu 


Glu Arg vai 


Tl A 
llo 


TH v 


UttU 


Lys 


Met 


395 










400 


Leu He Gin 


Glu 


Met 


Leu 


Glu 


Asn 


410 








415 




Gly Gin Pro 


Gly 


Gly 


Gly 


Gly 


Arg 


425 






430 






Pro Gly Ser 


Cys 


Ser 
445 


Pro 


Ser 


Leu 


Pro Ala Thr 


His 
460 


Ser 


Pro 
<210> 46

<211> 1531

<212> PRT

<213> Homo sapiens



<400> 46

Met Glu Val Ser Pro Leu Gln Pro Val Asn Glu Asn Met Gln Val Asn

1 5 10 15

Lys Ile Lys Lys Asn Glu Asp Ala Lys Lys Arg Leu Ser Val Glu Arg

20 25 30

Ile Tyr Gln Lys Lys Thr Gln Leu Glu His Ile Leu Leu Arg Pro Asp

35 40 45

Thr Tyr Ile Gly Ser Val Glu Leu Val Thr Gln Gln Met Trp Val Tyr

50 55 60

Asp Glu Asp Val Gly Ile Asn Tyr Arg Glu Val Thr Phe Val Pro Gly
65 70 75 80

Leu Tyr Lys Ile Phe Asp Glu Ile Leu Val Asn Ala Ala Asp Asn Lys

85 90 95

Gln Arg Asp Pro Lys Met Ser Cys Ile Arg Val Thr Ile Asp Pro Glu

100 105 110

Asn Asn Leu Ile Ser Ile Trp Asn Asn Gly Lys Gly Ile Pro Val Val

115 120 125

Glu His Lys Val Glu Lys Met Tyr Val Pro Ala Leu Ile Phe Gly Gln

130 135 140

Leu Leu Thr Ser Ser Asn Tyr Asp Asp Asp Glu Lys Lys Val Thr Gly
145 150 155 160

Gly Arg Asn Gly Tyr Gly Ala Lys Leu Cys Asn Ile Phe Ser Thr Lys

165 170 175

Phe Ser Val Glu Thr Ala Ser Arg Glu Tyr Lys Lys Met Phe Lys Gln

180 185 190

Thr Trp Met Asp Asn Met Gly Arg Ala Gly Glu Met Glu Leu Lys Pro

195 200 205

Phe Asn Gly Glu Asp Tyr Thr Cys Ile Thr Phe Gln Pro Asp Leu Ser

210 215 220

Lys Phe Lys Met Gln Ser Leu Asp Lys Asp Ile Val Ala Leu Met Val
225 230 235 240

Arg Arg Ala Tyr Asp Ile Ala Gly Ser Thr Lys Asp Val Lys Val Phe

245 250 255

Leu Asn Gly Asn Lys Leu Pro Val Lys Gly Phe Arg Ser Tyr Val Asp

260 265 270

Met Tyr Leu Lys Asp Lys Leu Asp Glu Thr Gly Asn Ser Leu Lys Val

275 280 285

Ile His Glu Gln Val Asn His Arg Trp Glu Val Cys Leu Thr Met Ser

290 295 300

Glu Lys Gly Phe Gln Gln Ile Ser Phe Val Asn Ser Ile Ala Thr Ser
305 310 315 320

Lys Gly Gly Arg His Val Asp Tyr Val Ala Asp Gln Ile Val Thr Lys

325 330 335

Leu Val Asp Val Val Lys Lys Lys Asn Lys Gly Gly Val Ala Val Lys

340 345 350

Ala His Gln Val Lys Asn His Met Trp Ile Phe Val Asn Ala Leu Ile

355 360 365

Glu Asn Pro Thr Phe Asp Ser Gln Thr Lys Glu Asn Met Thr Leu Gln
370 375 380
Pro Lys Ser Phe Gly Ser Thr Cys Gln Leu Ser Glu Lys Phe Ile Lys
385 390 395 400

Ala Ala Ile Gly Cys Gly Ile Val Glu Ser Ile Leu Asn Trp Val Lys

405 410 415

Phe Lys Ala Gln Val Gln Leu Asn Lys Lys Cys Ser Ala Val Lys His

420 425 430

Asn Arg Ile Lys Gly Ile Pro Lys Leu Asp Asp Ala Asn Asp Ala Gly

435 440 445

Gly Arg Asn Ser Thr Glu Cys Thr Leu Ile Leu Thr Glu Gly Asp Ser

450 455 460

Ala Lys Thr Leu Ala Val Ser Gly Leu Gly Val Val Gly Arg Asp Lys
465 470 475 480

Tyr Gly Val Phe Pro Leu Arg Gly Lys Ile Leu Asn Val Arg Glu Ala

485 490 495

Ser His Lys Gln Ile Met Glu Asn Ala Glu Ile Asn Asn Ile Ile Lys

500 505 510

Ile Val Gly Leu Gln Tyr Lys Lys Asn Tyr Glu Asp Glu Asp Ser Leu

515 520 525

Lys Thr Leu Arg Tyr Gly Lys Ile Met Ile Met Thr Asp Gln Asp Gln

530 535 540

Asp Gly Ser His Ile Lys Gly Leu Leu Ile Asn As Ile His His Asn
545 550 555 560

Trp Pro Ser Leu Leu Arg His Arg Phe Leu Glu Glu Phe Ile Thr Pro

565 570 575

Ile Val Lys Val Ser Lys Asn Lys Gln Glu Met Ala Phe Tyr Ser Leu

580 585 590

Pro Glu Phe Glu Glu Trp Lys Ser Ser Sir Pro Asn His Lys Lys Trp

595 600 605

Lys Val Lys Tyr Tyr Lys Gly Leu Gly Thr Ser Thr Ser Lys Glu Ala

610 615 620

Lys Gln Tyr Phe Ala Asp Met Lys Arg His Arg Ile Gln Phe Lys Tyr
625 630 635 640

Ser Gly Pro Glu Asp Asp Ala Ala Ile Ser Leu Ala Phe Ser Lys Lys

645 650 655

Gln Ile Asp Asp Arg Lys Glu Trp Leu Thr Asn Phe Met Glu Asp Arg

660 665 670

Arg Gln Arg Lys Leu Leu Gly Leu Pro Glu Asp Tyr Leu Tyr Gly Gln

675 680 685

Thr Thr Thr Tyr Leu Thr Tyr Asn Asp Phe Ile Asn Lys Glu Leu Ile

690 695 700

Leu Phe Ser Asn Ser Asp Asn Glu Arg Ser Ile Pro Ser Met Val Asp
705 710 715 720

Gly Leu Lys Pro Gly Gln Arg Lys Val Leu Phe Thr Cys Phe Lys Arg

725 730 735

Asn Asp Lys Arg Glu Val Lys Val Ala Gln Leu Ala Gly Ser Val Ala

740 745 750

Glu Met Ser Ser Tyr His His Gly Glu Met Ser Leu Met Met Thr Ile

755 760 765

Ile Asn Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Leu Asn Leu Leu

770 775 780

Gln Pro Ile Gly Gln Phe Gly Thr Arg Leu His Gly Gly Lys Asp Ser
785 790 795 800

Ala Ser Pro Arg Tyr Ile Phe Thr Met Leu Ser Ser Leu Ala Arg Leu

805 810 815

Leu Phe Pro Pro Lys Asp Asp His Thr Leu Lys Phe Leu Tyr Asp Asp

820 825 830

Asn Gln Arg Val Glu Pro Glu Trp Tyr Ile Pro Ile Ile Pro Met Val
835 840 845
Glu


Gly 

ncc 
ODD 


lie 


Gly 


Thr 


Gly 


Trp 
860 


Ser 


Cys 


Lys 


He 


Arg 


Glu 


lie 


Val 


Asn 


Asn 


He 


Arg 


Arg 


Leu 


Met 


870 










.875 










880 


Leu 


Pro 


Met: 


Leu 


Fro 




xy* 


Lys 


Asn 


Phe 


Lys 


















895 




Leu 


Ala 


fro 


Ann 


Gin 


Tvr 


Val 


He 


Ser 


Gly 


Glu 








90s 










910 




Val 


Ser 


Thr 


Thr 


Tl« 

XiU 


Glu 


Ile 


Ser 


Glu 
925 


Leu 


Pro 


Thr 


Tyr 
935 


Lys 


Glu 


Gin 


Val 


Leu 
940 


Glu 


Pro 


Mat 


Leu 


Thr 


Pro 


Pro 


Leu 


He 


Thr 


Asp 


Tyr 


Arg 


Glu 


Tyr 


950 










955 










960 


Val 


Lys 


Phe 


Val 


Val 


Lys 


Met 


Thr 


Glu 


Glu 
<400> 53

atgaaggaga tggtaggagg ctgctgcgta tgttcggacg agaggggctg ggccgagaac 60
ccgctggtct actgcgatgg gcacgcgtgc agcgtggccg tccaccaagc ttgctatggc 120
atcgttcagg tgccaacggg accctggttc tgccggaaat gtgaatctca ggagcgagca 180
gccagggtga ggtgtgagct gtgcccacac aaagacgggg cattgaagag gactgataat 240
ggaggctggg cacacgtggt gtgtgccctc tacatccccg aggtgcaatt tgccaacgtg 300
ctcaccatgg agcccatcgt gctgcagtac gtgcctcatg atcgcttcaa caagacctgt 360
tacatctgcg aggagacggg ccgggagagc aaggcggcct cgggagcctg catgacctgt 420
aaccgccatg gatgtcgaca agctttccac gtcacctgtg cccaaatggc aggcttgctg 480
tgtgaggaag aagtgctgga ggtggacaac gtcaagtact gcggctactg caaataccac 540
ttcagcaaga tgaagacatc ccggcacaga agcgggggag gcggaggagg cgctggagga 600
ggaggtggca gcatgggggg aggtggcagt ggtttcatct ctgggaggag aagccggtca 660
gcctcaccat ccacgcagca ggagaagcac cccacccacc acgagagggg ccagaagaag 720
agtcgaaagg acaaagaacg ccttaagcag aagcacaaga agcggcctga gtcgcccccc 780
agcatactca ccccgcccgt ggtccccact gctgacaagg tatcctcctc ggcttcctct 840
tcctcccacc acgaggccag cacgcaggag acctctgaga gcagcaggga gtcaaagggg 900
aaaaagtctt ccagccatag cctgagtcat aaagggaaga aactgagcag tgggaaaggt 960
gtgagcagtt ttacctccgc ctcctcttct tcctcctcct cttcctcctc ctctgggggg 1020
cccttccagc ctgcagtctc gtccctgcag agctcccctg acttctctgc attccccaag 1080

ctggagcagc cagaggagga caagtactcc aagcccacag cccccgcccc ttcagcccct 1140

ccttctccct cagctaccga gccccccaag gctgaccttt ttgagcagaa ggtggtcttc 1200 

tctggctttg ggcccatcat gcgcttctcc accaccacct ccagctcagg ccgggcccgg 1260 

gcgccctccc ctggggacta taagtctccc cacgtcacgg ggtctggggc ctcggcaggc 1320

acccacaaac ggatgcccgc actgagtgcc accbctgtgc ctgctgatga gacccctgag 1380 

acaggcctga aggagaagaa gcacaaagcc agcaagagga gccgccatgg gccaggccgt 1440

cccaagggca gccggaacaa ggagggcact gggggcccag ctgccccatc attgcccagt 1500

gcccagctgg ctggctttac cgccactgct gcctcaccct tctctggagg ttccctggtc 1560 

agctccggcc tgggaggtct gtcctcccga acctttgggc cttctgggag cttgcccagc 1620 

ttgagcctgg agtccccctt actaggggca ggcatctaca ccagtaataa ggaccccatc 1680 

tcccacagtg gcgggatgct gcgggctgtc tgcagcaccc ctctctcctc cagcctcctg 1740 

gggcccccag ggacctcggc cctgccccgc ctcagccgct ccccgttcac cagcaccctc 1800 

ccctcctctt ctgcttctat ctccaccact caggtgtttt ctctggctgg ctctaccttt 1860

agcctccctt ctacccacat ctttggaacc cccatgggtg ccgttaatcc cctcctctcc 1920 

caagctgaga gcagccacac agagccagac ctggaggact gcagcttccg gtgtcggggg 1980 
acctcccctc aggagagtct gtcttccatg tcccccatca gcagcctccc cgcactcttc 2040

gaccagacag cctctgcacc ctgtgggggc ggccagttag acccggcggc cccagggacg 2100 

actaacatgg agcagcttct ggagaagcag ggcgacgggg aggccggcgt caacatcgtg 2160 

gagatgctga aggcgctgca cgcgctgcag aaggagaacc agcggctgca agagcagatc 2220

ctgagcctga cggccaaaaa ggagcggctg cagattctca acgtgcagct ctctgtgcca 2280

ttccctgccc tgcctgctgc cctgcctgcc gccaacggcc ctgtccctgg gccctatggc 2340

ctgcctcccc aagccgggag cagcgactcc ttgagcacca gcaagagccc tccgggaaag 2400 

agcagcctcg gcctggacaa ctcgctgtcc acttcttctg aggacccaca ctcaggctgc 2460

ccgagccgca gcagctcgtc gctgtcctta cacagcacgc ccccaccgct gcccctcatc 2520

cagcagagcc ctgccactct gcccctggcc ctgcctgggg cccctgcccc actcccgccc 2580

cagccgcaga acgggttggg ccgggcaccc ggggcagcgg ggctgggggc catgcccatg 2640

gctgaggggc tgttgggggg gctggcaggc agtgggggcc tgcccctcaa tgggctcctt 2700 

ggggggttga atggggccgc tgcccccaac cccgcaagct tgagccaggc tggcggggcc 2760

cccacgctgc agctgccagg ctgtctcaac agccttacag agcagcagag acatctcctt 2820

cagcagcaag agcagcagct ccagcaactc cagcagctcc tggcctcccc gcagctgacc 2880

ccggaacacc agactgttgt ctaccagatg atccagcaga tccagcagaa acgggagctg 2940

cagcgtctgc agatggctgg gggctcccag ctgcccatgg ccagcctgct ggaaggaagc 3000 

tccaccccgc tgctgtctgc gggtacccct ggcctgctgc ccacagcgtc tgctccaccc 3060 

ctgctgcccg ctggagccct agtggctccc tcgcttggca acaacacaag tctcatggcc 3120 

gcagcagctg cagctgcagc agtagcagca gcaggcggac ctccagtcct cactgcccag 3180

accaacccct tcctcagcct gtcgggagca gagggcagtg gcggtggccc caaaggaggg 3240
accgctgaca aaggagcctc agccaaccag gaaaaaggct aa 3282

<210> 54
<211> 2227
<212> DNA
<213> Homo sapiens

<400> 54

gagagcccga acaggaagag ggtacagctt tgtgcaggtc acatgcccac tgcagccctc 60
cagcctctgg tccccagagc ggactttgga agctgaactg cttttgttgc tggaagactt 120
atgttataat ttaccctggg tggaccaggg tcgtacaaaa gggcaacgct ccccagtccc 180
cccactcccg accccggaat catgcatcgg actacacgga tcaaaatcac agagctgaac 240
ccccacctca tgtgtgccct ctgcgggggg tacttcatcg acgccaccac tatcgtggag 300
tgcctgcatt ccttctgcaa aacctgcatc gtgcgctacc tggagaccaa caaatactgc 360


cccatgtgtg acgtgcaggt ccataaaacc cggccgctgc tgagcatcag gtctgacaaa 420

acacttcaag acattgtcta caaattggtc cctgggcttt ttaaagatga gatgaaacgg 480

cggcgggatt tctatgcagc gtaccccctg acggaggtcc ccaacggctc caatgaggac 540
cgcggcgagg tcttggagca ggagaagggg gctctgagtg atgatgagat tgtcagcctc 600
tccatcgaat tctacgaagg tgccagggac cgggatgaga agaagggccc cctggagaat 660

ggggatgggg acaaagagaa aacaggggtg cgcttcctgc gatgcccagc agccatgacc 720

gtcatgcatc ttgccaagtt tctccgcaac aagatggatg tgcccagcaa gtacaaggtg 780

gaggttctgt acgaggacga gccactgaag gaatactaca ccctcatgga catcgcctac 840
atctacccct ggcggcggaa cgggcctctc cccctcaagt accgtgtcca gccagcctgc 900

aagcggctca ccctagccac ggtgcccacc ccctccgagg gcaccaacac cagcggggcg 960

tccgagtgtg agtcagtcag cgacaaggct cccagccctg ccaccctgcc agccacctcc 1020 

tcctccctgc ccagcccagc caccccatcc catggctctc ccagttccca tgggcctcca 1080

gccacccacc ctacctcccc cactccccct tcgacagcca gtggggccac cacagctgcc 1140

aacgggggta gcttgaactg cctgcagaca ccatcctcca ccagcagggg gcgcaagatg 1200 

actgtcaacg gcgctcccgt gcccccctta acttgaggcc agggaccctc tcccttcttc 1260

cagccaagcc tctccactcc ttccactttt tctgggccct tttttccact tcttctactt 1320 

tccccagctc ttcccacctt gggggtgggg ggcgggtttt ataaataaat atatatatat 1380 

atgtacatag gaaaaaccaa atatacatac ttattttcta tggaccaacc agattaattt 1440 

aaatgccaca ggaaacaaac tttatgtgtg tgtgtatgtg tggaaaatgg tgttcatttt 1500

ttttgggggg ggtcttgtgt aatttgctgt ttttgggggt gcctggagat gaactggatg 1560

ggccactgga gtctcaataa agctctgcac catcctcgct gtttcccaag gcaggtggtg 1620 

tgttgggggc cccttcagac ccaaagcttt aggcatgatt ccaactggct gcatatagga 1680 

gtcagttaga attgtttctt tctctccccg tttctctccc catcttggct gctgtcctgc 1740 

ctctgaccag tggccgcccc ccgcgttgtt gaatgtccag aaattgctaa gaacagtgcc 1800

ttttacaaat gcagtttatc cctggttctg aggagcaagt gcagggtgga ggtggcacct 1860 

gcatcacctc ctcctcttgc agtggaaact ttgtgcaaag aatagatagt tctgcctctt 1920

tttttttttt ttcctgtgtg tgtggccttt gcatcattta tcttgtggaa aagaagattc 1980 

aggccctgag aggtctcagc tcttggagga gggctaaggc tttagcattg tgaagcgctg 2040

cacccccacc aaccttaccc tcaccgggga accctcaata gcaggactgg tggtggagtc 2100

tcacctgggg cctagagtgg aagtgggggt gggttaacct cacacaagca cagatcccag 2160

actttgccag aggcaaacag ggaattccgc cgatactgac gggctccagg agtcgtcgcc 2220 

acactcg 2227 

<210> 55

<211> 4283
<212> DNA
<213> Homo sapiens

<400> 55

ttgcgggaaa gagccaaacc ctggcgttgg ggggcccggg cggggagccc ctcccgcggt 60

ccacagcgac gcctgcccag ccctcctccc cttccggctc cggcacgggg ccccgaggcg 120

ttcggaggcc aggcgggttt ctgtcaggcc cggggaggag gggcgggcgg ggcggccgct 180

gcctccccgg gacgggccgt accacgcgga cggggaggac ggggccaggg gactgcaggg 240

cggctgcacc gcccgggggc ggggtgcgga gcgggccggc gggctccccg gggcggggcg 300 

ggagggcggg gcgtggW<=5 gacggaacca ccggggcggg gtgggaggta acgggacggg 360 

cgcgaccatg gcgcggtgag ggagcggggg tggggatcgg tccgggggag gcctgaggcc 420

gctggcttgt gcgctgtctc cgccgccccc ctctttcgcc gccgccgccg ccgccccggg 480 

catgtcgtcc aactgcacca gcaccacggc ggtggcggtg gcgccgctca gcgccagcaa 540

gaccaagacc aagaagaagc atttcgtgtg ccagaaagtg aagctattcc gggccagcga 600

gccgatcctg agcgtcctga tgtggggggt gaaccacacg atcaatgagc tgagcaatgt 660

tcctgttcct gtcatgctaa tgccagatga cttcaaagcc tacagcaaga tcaaggtgga 720

caatcatctc ttcaataagg agaacctgcc cagccgcttt aagtttaagg agtattgccc 780

catggtgttc cgaaaccttc gggagaggtt tggaattgat gatcaggatt accagaattc 840

agtgacgcgc agcgccccca tcaacagtga cagccagggt cggtgtggca cgcgtttcct 900

caccacctac gaccggcgct ttgtcatcaa gactgtgtcc agcgaggacg tggcggagat 960

gcacaacatc ttaaagaaat accaccagtt tatagtggag tgtcatggca acacgctttt 1020

gccacagttc ctgggcatgt accgcctgac cgtggatggt gtggaaacct acatggtggt 1080

taccaggaac gtgttcagcc atcggctcac tgtgcatcgc aagtatgacc tcaagggttc 1140

tacggttgcc agagaagcga gcgacaagga gaaggccaag gacttgccaa cattcaaaga 1200 

caatgacttc ctcaatgaag ggcagaagct gcatgtggga gaggagagta aaaagaactt 1260 

cctggagaaa ctgaagcggg acgttgagtt cttggcacag ctgaagatca tggactacag 1320 

cctgctggtg ggcatccacg acgtggaccg ggcagagcag gaggagatgg aggtggagga 1380

gcgggcagag gacgaggagt gtgagaatga tggggtgggt ggcaacctac tctgctccta 1440

tggcacacct ccggacagcc ctggcaacct cctcagcttt cctcggttct ttggtcctgg 1500 

ggaattcgac ccctctgttg acgtctatgc catgaaaagc catgaaagtt cccccaagaa 1560 

ggaggtgtat ttcatggcca tcattgatat cctcacgcca tacgatacaa agaagaaagc 1620 

tgcacatgct gccaaaacgg tgaaacacgg ggcaggggcc gagatctcga ctgtgaaccc 1680 

tgagcagtac tccaaacgct tcaacgagtt tatgtccaac atcctgacgt agttctcttc 1740
taccttcagc cagagccaga gagctggata tggggtcggg gatcgggagt tagggagaag 1800
ggtgtatttg ggctagatgg gagggtggga gcagagtcgg gtttgggagg gctttagcaa 1860
tgagactgca gcctgtgaca ccgaaagaga ctttagctga agaggagggg gatgtgctgt 1920
gtgtgcacct gctcacagga tgtaacccca ccttctgctt acccttgatt ttttctcccc 1980
atttgacacc caggttaaaa aggggttccc tttttggtac cttgtaacct tttaagatac 2040
cttggggcta gagatgactt cgtgggttta tttgggtttt gtttctgaaa tttcattgct 2100

ccaggtttgc tatttataat catatttcat cagcctaccc accctcccca tctttgctga 2160 

gctctcagtt cccttcaatt aaagagatac ccagtagacc cagcacaagg gtccttccag 2220

aaccaagtgc tatggatgcc agattggaga ggtcagacac ctcgccctgc tgcatttgct 2280

cttgtctgga ttaactttgt aatttatgga gtattgtgca caacttcctc cacctttccc 2340

ttggattcaa gtgaaaactg ttgcattatt cctccatcct gtctggaata caccaggtca 2400

acaccagaga tctcagatca gaatcagaga tctcagaggg gaataagttc atcctcatgg 2460

gatggtgagg ggcaggaaag cggctgggct cttggacacc tggttatcag agaaccctgt 2520

gatgatcacc caagccccag gctgtcttag cccctggagt tcagaagtcc tctctgtaaa 2580

gcctgcctcc cactaggtca agaggaacta gagtaccttt ggatttatca ggaccctcat 2640

gtttaaatgg ttatttccct ttgggaaaac ttcagaaact gatgtatcaa atgaggccct 2700

gtgccctcga tctatttcct tcttccttct gacctcctcc caggcactct tacttctagc 2760

cgaactctta gctctgggca gatctccaag cgcctggagt gctttttagc agagacacct 2820 

cgttaagctc cgggatgacc ttgtaggaga tctgtctccc t^tgcctgga gagttacagc 2880

cagcaaggtg cccccatctt agagtgtggt gtccaaacgt gaggtggctt cctagttaca 2940

tgaggatgtg atccaggaaa tccagtttgg aggcttgatg tgggttttga cctggcctca 3000

gccttggggc tgtttttcct tgttgccccg ctctagactt ttagcagatc tgcagcccac 3060

aggctttttt ggaaggagtg gcttcctgca ggtgttccac ctgccttcgg agcctgccac 3120

ccaggccctc agaactgagc cacaggctgc tctggccagg agagaaacag ctctgttgtt 3180

ctgcattggg ggaggtacat tcctgcatct tctcacccca tcaaccagga actggggatt 3240

tgggatgaga tatggtcaga cttgtagata accccaaaga tgtgaagatc gcttgtgaaa 3300 

ccattttgaa tgaatagatt ggtttcctgt ggctccctcc aaacctggcc aagcccagct 3360 

tccgaagcag gaaccagcac tgtctctgtg cctgactcac agcatatagg tcaggaaaga 3420

atggagacgg cattcttgga attcactggg gctgctggat tggatgggaa accttctgga 3480 

agaggcagat gggggtcaaa ccactgcctt ggccccagga aggggccata ggteggtctg 3540

aacaactgcc gcaagaccac tacatgactt agggaacttg aaaccaactg gctcatggag 3600 

aaaacaaatt tgacttggga aagggattat gtaggaataa tgtttggact tgatttcccc 3660 

acgtcataat gaagaatgga agtttggatc tgctcctcgt caggcgcagc atctctgaag 3720 

cttggaaagc tgtcttccag cctccaaacc tggccaagcc cagcttccga agcaggaacc 3780

agcactgtct ctgtgcctga ctcacagcat ataggtcagg aaagaatgga gacggcattc 3840

ttggacttca ctggggctgc tggattggat gggaaacctt ctggaagagg cagatggggg 3900
tcaaaccact gccttggccc caggaagggg ccataggtag gtctgaacaa atgccgcaag 3960
accactacat gacttaggga acttgaaacc aactggctca tggagaaaac aaatttgact 4020
tgggaaaggg attatgtagg aataatgttt ggacttgatt tccccacgtc ataatgaaga 4080

atggaagttt ggatctgcta ctcgtcagga gcagcatctc tgaagcttgg aaagctgtct 4140 

tccagcagcc tccgtggcct cgggttccta ccggattctc tgcatttggt ctgctgatca 4200

tgttgccata atgtgtatgg aaagtgtaac acattcttac tggttaaaga cgactaccag 4260 

gtatctaact tgtttaacat tga 4283 

<210> 56

<211> 6140

<212> DNA

<213> Homo sapiens

<400> 56

gcggccgcag cctgagccag ggccccctcc ctcgtcagga ccggggcagc aagcaggccg 60 

ggggcaggtc cgggcaccca ccatgcgagg cgagctctgg ctcctggtgc tggtgctcag 120

ggaggctgcc cgggcgctga gcccccagcc cggagcaggt cacgatgagg gcccaggctc 180 

tggatgggct gccaaaggga ccgtgcgggg ctggaaccgg agagcccgag agagccctgg 240 

gcatgtgtca gagccggaca ggacccagct gagccaggac ctgggtgggg gcaccctggc 300 
catggacacg ctgccagata acaggaccag ggtggtggag gacaaccaca gctattatgt 360

gtcccgtctc tatggcccca gcgagcccca cagccgggaa ctgtgggtag atgtggccga 420
ggccaaccgg agccaagtga agatccacac aatactctcc aacacccacc ggcaggcttc 480
gagagtggtc ttgtcctttg atttcccttt ctacgggcat cctctgcggc agatcaccat 540

agcaactgga ggcttcatct tcatggggga cgtgatccat cggatgctca cagctactca 600

gtatgtggcg cccctgatgg ccaacttcaa ccctggctac tccgacaact ccacagttgt 660

ttactttgac aatgggacag tctttgtggt tcagtgggac cacgtttatc tccaaggctg 720

ggaagacaag ggcagtttca ccttccaggc agctctgcac catgacggcc gcattgtctt 780

tgcctataaa gagatcccta tgtctgtccc ggaaatcagc tcctcccagc atcctgtcaa 840

aaccggccta tcggatgcct tcatgattct caatccatcc ccggatgtgc cagaatctcg 900 

gcgaaggagc atctttgaat accaccgcat agagctggac cccagcaagg tcaccagcat 960 

gtcggccgtg gagttcaccc cattgccgac ctgcctgcag cataggagct gtgacgcctg 1020

catgtcctca gacctgacct tcaactgcag ctggtgccat gtcctccaga gatgctccag 1080

tggctttgac cgctatcgcc aggagtggat ggactatggc tgtgcacagg aggcagaggg 1140 

caggatgtga gaggacttcc aggatgagga ccacgactca gcctcccctg acacttcctt 1200 

cagcccctat gatggagacc tcaccactac ctcctcctcc ctcttcatcg acagcctcaa 1260

cacagaagat gacaccaagt tgaatcccta tgcaggagga gacggccttc agaacaacct 1320

gtcccccaag acaaagggca ctcctgtgca cctgggcacc atcgtgggca tcgtgctggc 1380

agtcctcctc gtggcggcca tcatcctggc tggaatttac atcaatggcc accccacatc 1440

caatgctgcg ctcttcttca tcgagcgtag acctcaccac tggccagcca tgaagtttcg 1500 

cagccaccat gaccattcca cctatgcgga ggtggagccc tcgggccatg agaaggaggg 1560 

cttcatggag gctgagcagt gctgagaaca ccaagtctcc cctttgaaga ctttgaggcc 1620

acagaaaaga cagttaaagc aaagaagaga agtgactttt cctggcatct cccagcatgc 1680

cctgggctga gatgagatgg tggtttatgg ctccagagct gctgttcgct tcgtcagcac 1740

accccgaata ttgaagaggg ggccaaaaaa caaccacatg gattttttat aggaacaaca 1800

acctaatctc atcctgtttt gatgcaaggg ttctcttctg tgtcttgtaa ccatgaaaca 1860

gcagaagaac taacataact aactccattt ttgtttaagg ggcctttacc tattcctgca 1920

cctaggctag gataacttta gagcactgac ataaaacgca aaaacaggaa tcatgccgtt 1980

tgcaaaacta actctgggat taaaggggaa gcatgtaaac agctaactgt ttttgttaaa 2040

gatttatagg aatgaggagg tttggctatt gtcacatgac agactgttag ccaaggacaa 2100

agaagttctg caaacctccc ctggaccctt gctggtgtcc agatgtctgc ggttgtcagc 2160

cccttccttt cccccgacct aaacataaaa gacaaggcaa agcccgcata attttaagac 2220 

ggttctttag gacattagtc caccatcttc ttggtttgct ggctctccga aataaagtcc 2280

ctttccttgc tccaactcct tgtctctcaa cgtattggct atgacgcagc aagcagaatg 2340 

aatttggact cagttacagg ctgtcaatgg tctgctctgt agcagtctca gagcatcccc 2400

gacccactac ctggagatag ccagatagcc agatgccctg ctcctggcca cctttaaagc 2460

ccctgcatat gacacaggtt aactaaagtc aagattgggg ctgctgcatt ccaggttccc 2520 

tagactcaca agctggtcct tggccaggtg cagtggctca cgcatgtaat cccagcactt 2580 

tgggaggctg aggcaggcgg atcacctgaa gtcagaagtt tgagaccagc ctggccaaca 2640 

taattaaaat gtctctacta aaaatacaaa aaattagctg ggtgtggtga cgcttgcctg 2700

tatcccagct actcaggaag ctgagacacg agaatcactt gaacctggga ggcagaggtt 2760

gcagtgagct cagatagtgc cactgcactc cagcctgggt gacagagcga gactccgtat 2820

caaaaaaaaa aaaagaaagc agaacctcat ggctatagag ttggcatttt agccccagct 2880

tctgtagctc tgaaagccta aagaaggtat tctctccatc tgttaaacac agtatagtgg 2940

ctctcagccc ttggggcatg ttatcatggg agggaagtca aataagagga gagaaaagaa 3000

ctcaaggggg aaactgcatt tttaggcttt gctctcttac cttgcccttt ctactcagaa 3060 

ccaataactt ctgcatcaaa acatgttaca gcctgcatca agggctttac cccaacctgc 3120 

agcccagcct tccctgggtg agcttgctat gcgcagccac atttaccatg tggggctccc 3180

tattctgatg gcctgttcgg tgccgggttt actcactgcc ctgttctgat gtcagtgcct 3240 

gtacatacct ccaaaggcag gacttgcctg ataaatattt ttcctcctct gaactggatt 3300 

ttataggcat taaagacaag tcgggtggct agagggctcc ttgagacata cctagcaggg 3360

aactgcaggt ggattctgtt gagaggcaaa gcacctgagt ggttgggaca caggcagctg 3420 

gcatgggagg gacttttttt gagacagggt ctcactgtgt cgcccagggc aaggatgccc 3480

aaagacacca ggttggagag gcacctgcca actacttgct ttccctggag cctgcatgtg 3540 

cctgtggggt ggggaggcgt aggggtctac ggctgcctga gatgggtgtg cacagtgtgt 3600 

gaagtaccta cctccttgcc ttgctggact gtcagccagt cgcagggccg gccacaagac 3660

ccatgtctcc atctggtcat actccatagc taccaagtta acctgctcta aactttggag 3720 

aactggatct gtccaataaa cgcttatttg gccaagcctg atggctcgtg cctgtactcc 3780 

cagcactttg ggaggctgag gtgggagggt tgattgagcc caggggtttg agaccagctt 3840 

gggcaacaac aacaaaaatg ccaggtgtgg tggggtgcac ctgtagtccc agctactagg 3900

gaggctgagc caggaggatc acttgagccc gggaggttga ggctgcagtg gggggtcata 3960 

atcatgccac tgtactccag cctgggtgac agagtgagac cctgtctccg aaaaaaaaaa 4020
ia acggaaaaag aaatgcttac attgtcaggg atcctgtaga caatcattaa 4080

ctctatgaga tgcttggttc tatttttttg ggagactttg tccaagtgtt ttggcttaag 4140



aaatccatag gcctctcttg gtgacacatc tctagtactt tttgtcataa acaaacaggc 4200

catctgccgc caaatacatc cactccccat gccactgaca tcctatgggt cagccaggct 4260

tgctttgact gaggccgagg catctggaac tttctctgcc tgcaggggct agcagcagag 4320

gcttcaccgc atcaccaccc cttcctccac tcctgacatt ctttcccttc agggatccaa 4380 

aatggttggc cgagctccca gtgggaaaac gtgtgctaga gttggggagt gagatgagtg 4440

gtgctgtcca tggaatcagg ccacagcagg aactgcccca ctggccattt gagacacaca 4500 

caggtggtaa atgctctgct ggtgggctgt gcttccctca ttcagagagc tctgttaaag 4560 

cccactgtgt cctttagaag cttgaaagga acccaaotct ttgctgcact gtcctttttc 4620 

ttcotcaaat tcagaccctc cttccaccgg caccccccta ctccaccctc agctcttcct 4680 

tgcctggttt atcaagcaga gctgaggccc cacgtttcca actctgattg tcacttgcat 4740 

cttcacaaag gataaaccac ggagcaactg gaaaaccatc agccaagcgt tcggatgagt 4800 

ctggttattg gtccaccccc gaccagattc ccttacactt aactcaottc tttctttggc 4860 

aatgaccctc atgacatgta taaatgggta tgactaagaa gaggctgtga totaacattt 4920 

atttgctgcc attttttact ctggggagaa gcagccccaa otcatcactg ggaaagaact 4980 

ccccctgcaa accagotaaa tttgataatt taaaccccct gcoGctaaaa cttotcacag 5040 

agctggggag ttggtggcaa ctttccaagt oaaggtottg cttagaaagt ccttcactac 5100 

atggccaggt gcagtggctc acgcctgtag tcccaggtac ttgggagcct gaggcaggag 5160 

gattgcttga gctcaggagt tcaaggctgc agagagctat gatcatccca atgcatttgt 5220 

ttaaaaataa atttttaaaa tttgtgtgtt ttatcagggg tctcotgtac agtgtatctg 5280 

tgtatgtttg tgtgtgtgtt tgtatacago cttgtttaat gttttgagca ataagatatg 5340 

oacacacagg tattttgttg ctaaagagat tggacaaggt tgtagctgtg ctcaggcttc 5400 

agcttggttt gttaaattga gagataaaca atgacaagag ctgccagcca acoacactat 5460 

tcaaaaagca aagtgttoac caotaaagct aaccattcat otggttgcag gcaaggctaa 5520 

25 ggctctotct cctctagttc ctggaacaga cstcacagatt ggcatgaago actgatcagg 5580 

ggctgcaotc agactccctg gcoaagcaaa cctacaccag aagagtcagt gtcacagata 5640 

tgatgcggcc aatctctgtc tccaaaaaco tacctgaaot taatggtaga attcaaagat 5700 

ctggggactg agggcaccca gccttctaaa acacaatgta ttcatgtgtt tagtgtaaac 5760 

tctctgcatg gattctcagt gttaataata aaaggaagca ttcttttaca actcctgctg 5820 

tgtgcaaaag aaagtgcaaa ggatttggag tggcattccg aagatcacca cacatacctt 5880 

ggttctgatg gctgctgaac tccgacttct tcgctgagac atgactgtgg gaacagcctc 5940

cagotatctg ctcatcagag gtgctttcct caacotootg caccacctcc aagagaaaca 6000 

gcctaaaaag aaaccccagc tgtttactta tattggtctg taaatccctg gaagtaaacc 6060 
ccatgcattt ttatatactg tctgaggaca tacaataaat ctgagaaagt ctatgctgtc 6120



aaaaaaaaaa aaaaaaaaaa 6140 
<210> 57 
<211> 2098 
<212> DNA 
<213> Homo sapiens 



<400> 57

gcaggagcac gtggagaggc cgggtagcca cagcggcagc tccagcccgg cccggcagcg 60

acatggaaga tatacaaaca aatgcggaac tgaaaagcac tcaggagcag tctgtgcccg 120

cagaaagtgc agcggttttg aatgactaca gtttaaccaa atctcatgaa atggaaaatg 180

tggacagtgg agaaggccca gccaatgaag atgaagacat aggagatgat tcaatgaaag 240

tgaaagatga atacagtgaa agagatgaga atgttttaaa gtcagaaccc atgggaaatg 300

cagaagagcc tgaaatccct tacagctatt caagagaata taatgaatat gaaaacatta 360

agttggagag acatgttgtc tcattcgata gtagcaggcc aaccagtgga aagatgaact 420

gcgatgtgtg tggattatcc tgcatcagct tcaatgtctt aatggttcat aagcgaagcc 480

atactggtga acgcccattc cagtgtaatc agtgtggggc atcttttact cagaaaggta 540

acctcctccg ccacattaaa ctgcacacag gggaaaaacc ttttaagtgt cacctctgca 600

actatgcatg ccaaagaaga gatgcgctca cggggcatct taggacacat tctgtggaga 660
aaccctacaa atgtgagttt tgtggaagga gttacaagca gagaagttcc cttgaggagc 720

acaaggagcg ctgccgtaca tttcttcaga gcactgaccc aggggacact gcaagtgcgg 780

aggcaagaca catcaaagca gagatgggaa gtgaaagagc tctcgtactg gacagattag 840

caagcaatgt ggcaaaacga aaaagctcaa tgcctcagaa attcattggt gagaagcgcc 900

actgctttga tgtcaactat aattcaagtt acatgtatga gaaagagagt gagctcatac 960

agacccgcat gatggaccaa gccatcaata acgccatcag ctatcttggc gccgaagccc 1020

tgtgccactt ggtccagaca ccgcctgctc ccacctcgga gatggttcca gttatcagca 1080

gcatgtatcc catagccctc acccgggctg agatgtcaaa cggtgcccct caagagctgg 1140

aaaggaaaag catcctcctt ccagagaaga gcgtgccttc tgagagaggc ctctctccca 1200

acaatagtgg ccacgactcc acggacactg acagcaacca tgaagaacgc cagaatcaca 1260

tctatcagca aaatcacatg gtcctgtctc gggcccgcaa tgggatgcca cttctgaagg 1320

aggttccccg ctcttacgaa ctcctcaagc ccccgcccat ctgcccaaga gactctgtca 1380

aagtgatcga caaggaaggg gaggtgatgg atgtgtatcg gtgtgaccac tgccgcgtcc 1440 

tcttcctgga ctatgtgatg ttcacgattc acatgggctg ccacggcttc cgtgaccctt 1500 

tcgagtgtaa catgtgtgga gatcgaagcc atgatcggta tgaattctcg tctcacatag 1560

ccagaggaga acacagaagc ctgctgaagt gaatatctgg tctcagggat tgctcctatg 1620 

tattcagcat cgtttctaaa aacagttgac ctcgoctaac agattgctct caaaacatac 1680 

tcagttccaa acttottttc ataccatttt tagctgtgtt cacaggggta gccagagaaa 1740 

oactgtottc cttcagaaat tattcgcagg tctagcatat tattactttt gtgaaacctt 1800 

tgttttccca tcagggactt gaattttatg gaatttaaaa gcoaaaaagg tatttggtca 1860 

ttatcttcta cagcagtgga atgagtggtc ccggagatgt gctatatgaa acattctttc 1920

tgagatatat caaccacacg tggaaaagcc tttcagtcat acatgcaaat ccacaaagag 1980 

gaagagotga ccagctgacc ttgctgggaa gcctcaccct tctgccctta acaggctgaa 2040 

gggttaagat ctaatctccc taatctaaat gacagtctaa gagtaagtaa aagaacag 2098 

<210> 58

<211> 2947

<212> DNA

<213> Homo sapiens

<400> 58

atgccaattc ctcctccccc gccaccccca cctggtcctc ctccacctcc cacatttcat 60

caggoaaaca cagagcagoc caagctgagt agagatgagc agcggggtcg aggcgccctc 120 

ttacaggaoa tttgcaaagg gaccaagctg aagaaggtga ccaacattaa tgatcggagt 180 

gatcccatcc tcgagaagcc gaaaggaagc agtggtggct atggotctgg aggagctgcc 240 

ctgcagccca agggaggtct cttccaagga ggagtgctga agcttcgacc tgtgggagcc 300 

aaggatggtt cagagaacct agctggtaag ccagccctga aaatccccag ttctcgagct 360

gctgccccaa ggcctccagt atctgccgcc agcgggcgtc ctcaggatga tacagaoago 420 

agccgggcct cactcccaga actgccccgg atgcagagac cctctttacc ggacctctct 480 

cggcctaata ccaccagcag tacgggcatg aagcacagct cctctgcccc tcccccacoa 540 

cccccagggc ggcgtgccaa cgcacccccc acacotctgc ctatgcacag cagcaaagcc 600 

cccgcctaca acagagagaa acccttgcca ccgacgcctg gacaaaggct tcaccctggt 660 

cgagagggac ctcctgctcc acccccagtc aaaccacctc cttcccctgt gaatatcaga 720

acaggaccaa gtggccagtc tctggctcct cctcctccgc cttaccgcca goctcctggg 780 

gtccccaatg gaccctctag ccccaotaat gagtcagccc ctgagctgcc acagagacac 840 

aattctttgc ataggaagac accagggcct gtcagaggcc tagcacctcc tccacccacc 900 

tcggcctccc catctttact gagtaatagg ccacctcccc cagcccgaga ccctcccagt 960

cggggagcag ctcctccacc cccaccacct gtgatccgaa atggtgccag ggatgctccc 1020

cctcccccac caccataccg aatgcatggg tcagaacccc cgagocgagg aaagccccca 1080 

cctccaccct oaaggacgcc agctgggcca ccccctcctc ctccaccgcc cctgaggaat 1140 

ggccacagag attctatcac cactgtccgg tctttcttgg atgattttga gtcaaagtat 1200 

tccttccatc cagtagaaga ctttcctgct ccagaagaat ataaacactt tcagaggata 1260 

tatcccagca aaacaaaccg agctgcccgt ggagccccac ctctgccacc cattctcagg 1320

tgaagcctgg cttggtcccg ttcctcagga aaaggatgga ccttctcttc ttctcagatg 1380 
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20 



gtcccttcca ttcccctgaa acctgcatga gagctcctaa catgtttctc caatgcaatc 1440

aagccctaga ctccaaatgt cctcccagct cacctccatc tatgcatctc atctctggat 1500

ttggtgatca gactctatat tgacagtagg atctcaaacc ctgcatccat ccttcctcca 1560

gcaagccctg ctagccacat gaggaacaag tttccgtgtc ttctgccttc ctcttgggga 1620

aaggtgcctt gttgtgatga attaactcac tgttagggca gggtggagaa tggtactcct 1680

tccttatcct gtccactgtg ggggaagctt ggcaggtata ttatatttca tcatttagga 1740

ggctggcatg accaggactt atgggtggga ggggagcatt tttagtgaag caagaaagga 1800

gtttgccaag aagtgatctg ttttaaaggt catatttgga gaaagggcaa ggaattgggt 1860

ctgctttatt tttgggggta ttttgttttt gttctcacct gctgcccccc caccccacca 1920

ccccagggat aaattggata taaacactaa atactaatca gttgaactta acatttaata 1980

aaaagaaagg gtgaaataaa ctgaagacca ttttagaact agtcagttct ctgcagcaaa 2040

gggaacagga gccatttgaa ccctctggga cccctcaccc cactgcttca gggtgctagg 2100

ctgagggatg tttttcctcc cccttaccgc ccatgccctt gaaagaaaag tcactttttg 2160

tggagggcat cattcattcc tgattcacaa accccaaaaa cctctggtgg gagataggaa 2220

gatagggcgt gggcctgggc cttaacctca atcttgtgtc tgcctcagtc ttttctgact 2280

ggccctgaag ttgtcagtgg ctctttctgt ccttcagccc ctggaaggtg ctccaggata 2340

acaaagaagg gcaggttgaa gcccctcatg gaaggagctg gctttgtggg gctgcaaagg 2400

acttttaagt cctgcctgta ctgaagttca cagcccacct gactgagcag actcttcctg 2460

ttcctttctc taccaccctt gccttcccag gactgcacgg tttaacacag cagagtacag 2520

aagggtgaag aagtgagcag aggcttatga agatattcag atactcttct atgccaggaa 2580

gcacaaagac tttgttgaga tttgcctcag ttcagtagat cttccttggc agccagccat 2640

aggttgtttc tttgtcttxtc gggtcataaa gagcacagag aaaatggagg tccccagtct 2700

aggtaggaag ctgattggat gaggacttct ttttttccga cagcaggatg gggctcttgg 2760

gctccacaca ccagatgctt tggttttcta caactgttgc tatgtgtaga gggtgctcag 2820

agcgtggcat gagagcaagg agaccatggc tactctttga aatggatggg gaaaattagc 2880

ttaaaaattt aatcacgaga ttgcgccact gcactccagc ctgggcgaca gagccagact 2940

ccgtctc 2947

<210> 59

<211> 784

<212> DNA

<213> Homo sapiens

<400> 59



tttctttttt tgaaataaaa tagcctgtct

ttgctctggc gatcgagggg 60

gaggggccgt catggccatg 120

tcgggatcca ggcccagatg 180

ggctgtacat cggtctggcc 240

agttccggct gaacctgtat 300

tgagcatggt ggccaacctc 360

tcattgccgg gttggacccg 420

gctgccccat ggtgactgat 480

gaatgtgtga gtccctctgg 540

cccaagccat gctgaatgct 600

acatcatcga gaaggacaaa 660

gttcccagag cccacttttt 720

aaaaaaaaaa aaaaaaaaaa 780

784
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<210> 60 



<211> 3033

<212> DNA

<213> Homo sapiens

<400> 60

atactcctaa gctcctcccc cggcggcgag ccagggagaa aggatggccg gcctggcggc 60

gcggttggtc ctgctagotg gggcagcggc gctggcgagc ggctcccagg gcgacogtga 120 

gccggtgtac cgcgactgcg taotgcagtg cgaagagcag aactgctotg ggggcgctct 180 

gaatcacttc cgctcccgcc agccaatcta catgagtcta gcaggctgga cctgtcggga 240

cgactgtaag tatgagtgta tgtgggtcao cgttgggctc tacctccagg aaggtcaoaa 300 

agtgcctcag ttooatggca agtggccctt ctccoggttc ctgttctttc aagagocggc 360 

atoggccgtg gcctcgtttc tcaatggcct ggccagcctg gtgatgctct gccgctaccg 420 

caccttcgtg ccagcctcct cccccatgta ccacacctgt gtggccttcg cctgggtgtc 480 

cctcaatgca tggttotggt ccacagtctt ccacaccagg gaoactgacc tcacagagaa 540 

aatggactac ttctgtgcct ccactgtcat cctacactca atctacctgt gctgcgtcag 600

gtgagcctgc ctgggtggct goaggggcaa aatogaacoo tgggggcaga aaggggtcac 660 

ccagccttcc catgggggcc ttcttcacta gtctcccaac acctacgoco cccaaccccc 720 

aacacatcag ctgtcctggg tgaggactct ggggtaggao tgggggccct ggctcotgac 780 

aaggagctgt agcacttgct gcccagctgt ggcctgtttg gtggggagag gggtagtgac 840 

ttcaggggcc atgcaccaat gttgggggga ggagatgctt cagggaatgc tgctctgggg 900

atgggccacc tgccotctga gcaaccctgg acggtggggc aggaccgtgg ggctgcagca 960 

cocagctgtg gtcagtgcct tccgggctct cctgctgctc atgctgaccg tgcacgtctc 1020 

otacctgagc ctcatccgct tcgactatgg ctacaacctg gtggccaacg tggctattgg 1080 

cctggtcaac gtggtgtggt ggctggcctg gtgcctgtgg aaccagcggc ggctgcctca 1140 

cgtgcgcaag tgcgtggtgg tggtcttgct gctgcagggg ctgtccctgc tcgagctgct 1200 

tgacttccca ccgctcttct gggtcctgga tgcccatgcc atctggcaca tcagcaccat 1260

ccctgtccac gtcctctttt tcagctttct ggaagatgac agcctgtacc tgctgaagga 1320 

atcagaggac aagttcaagc tggactgaag accttggagc gagtctgccc cagtggggat 1380 

cctgcccccg ccctgotggc ctcccttctc ccctcaaccc ttgagatgat tttctctttt 1440 

caacttcttg aacttggaca tgaaggatgt gggcccagaa tcatgtggcc agcccacccc 1500 

ctgttggccc tcaccagcct tggagtctgt tctagggaag gcctcccagc atctgggact 1560

cgagagtggg cagcccctct acctcctgga gctgaactgg ggtggaactg agtgtgttct 1620

tagctctacc gggaggacag ctgcctgttt cctcccoaac agcctcctco ocacatcccc 1680 

agctgcctgg ctgggtcctg aagccctctg tctacctggg agaccaggga ccacaggcct 1740 

tagggataca gggggtcccc ttctgttacc accccccacc ctcctccagg acaccaotag 1800 

gtggtgctgg atgcttgttc tttggccagc caaggttcac ggcgattctc cccatgggat 1860

cttgagggac caagotgotg ggattgggaa ggagtttcac cctgaccgtt gcoctagoca 1920 

ggttcccagg aggcctcacc atactccctt tcagggccag ggctcoagca agccoagggc 1980 

aaggatcctg tgctgctgtc tggttgagag cctgccaccg tgtgtcggga gtgtgggcca 2040 

ggctgagtgc ataggtgaca gggccgtgag catgggcctg ggtgtgtgtg agatoaggcc 2100 

taggtgcgca gtgtggagac gggtgttgtc ggggaagagg tgtggcttca aagtgtgtgt 2160 

gtgcaggggg tgggtgtgtt agcgtgggtt aggggaacgt gtgtgcgcgt gctggtgggc 2220

atgtgagatg agtgactgcc ggtgaatgtg tccacagttg agaggttgga gcaggatgag 2280 

ggaatcctgt caccatcaat aatcacttgt ggagcgccag ctctgcccaa gacgccacct 2340

gggcggacag ccaggagctc tccatggcca ggctgcctgt gtgcatgttc cctgtctggt 2400

gcccctttgc ccgcctcctg caaacctcac agggtcccca cacaaoagtg ccctccagaa 2460 

gcagcccctc ggaggcagag gaaggaaaat ggggatggct ggggctctct ccatcctcct 2520 

tttctccttg ccttcgcatg gctggccttc ccctccaaaa cctccattcc cctgctgcca 2580

gcccctttgc catagcctga ttttggggag gaggaagggg cgatttgagg gagaagggga 2640

gaaagcttat ggctgggtct ggtttcttcc cttcccagag ggtcttactg ttccagggtg 2700 

gccccagggc aggcaggggc cacactatgc ctgcgccctg gtaaaggtga cccctgccat 2760 

ttaccagcag ccctggcatg ttcctgcccc acaggaatag aatggaggga gctccagaaa 2820 

ctttccatcc caaaggcagt ctccgtggtt gaagcagact ggatttttgc tctgcccctg 2880
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accccttgtc cctctttgag ggaggggagc tatgatagga ctccaacctc agggactogg 2940 
gtggcctgcg ctagcttctt ttgatactga aaacttttaa ggtgggaggg tggcaaggga 3000 
tgtgcttaat aaatcaattc caagcctcac ctg 3033 



<210> 61 

<211> 1174 

<212> DNA 

<213> Homo sapiens

<400> 61

aagctcctcc cccggcggcg agccagggag aaaggatggo oggcctggcg gagcggttgg 60 

tootgctago tggggcagcg gcgotggcga gcggotccca gggcgaccgt gagccggtgt 120 

accgcgactg cgtactgcag tgcgaagago agaactgotc tgggggcgct ctgaatcact 180 

tcagctcccg ccagccaato tacatgagtc tagcaggctg gacctgtcgg gacgaotgta 240 

agtatgagtg tatgtgggtc acogttgggc tctacctcca ggaaggtoac aaagtgcctc 300 

agttccatgg caagtggccc ttctcccggt tcctgttctt tcaagagccg gcatcggccg 360

tggcctcgtt tctcaatggc ctggccagcc tggtgatgct otgccgctao cgcaccttcg 420 
tgccagcctc ctcccccatg taccacacct gtgtggcctt cgcctgggtg tccctcaatg 480

catggttctg gtccacagtc ttccacacca gggacactga cctcacagag aaaatggact 540

acttctgtgc ctccactgtc atcctacact caatctacct gtgctgcgtc aggaccgtgg 600

ggctgcagca cccagctgtg gtcagtgcct tccgggctct cctgctgctc atgctgaccg 660

tgcacgtctc ctacctgagc ctcatccgct tcgactatgg ctacaacctg gtggccaacg 720

tggctattgg cctggtcaac gtggtgtggt ggctggcctg gtgcctgtgg aaccagcggc 780

ggctgcctca cgtgcgcaag tgcgtggtgg tggtcttgct gctgcagggg ctgtccctgc 840

tcgagctgct tgacttccca ccgctcttct gggtcctgga tgcccatgcc atctggcaca 900

tcagcaccat ccctgtccac gtcctctttt tcagctttct ggaagatgac agcctgtacc 960

tgctgaagga atcagaggac aagttcaagc tggttgaagc agactggatt tttgctctgc 1020

ccctgacccc ttgtccctct ttgagggagg ggagatatgc taggactcca acctcaggga 1080

ctcgggtggc ctgcgctagc ttcttttgat actgaaaact tttaaggtgg gagggtggca 1140

agggatgtgc ttaataaatc aattccaagc ctca 1174

<210> 62

<211> 3167

<212> DNA

<213> Homo sapiens

<400> 62

aagctcctcc cccggcggcg agccagggag aaaggatggc cggcctggcg gcgcggttgg 60
tcctgctagc tggggcagcg gcgctggcga gcggctccca gggcgaccgt gagccggtgt 120 



accgcgactg cgtactgcag tgcgaagago agaactgctc tgggggcgct ctgaatcact 180 

tccgctcccg ccagccaatc tacatgagtc tagcaggctg gacctgtcgg gacgaotgta 240 

agtatgagtg tatgtgggtc accgttgggc tctacctcca ggaaggtcac aaagtgcctc 300

agttccatgg caagtggccc ttctcccggt tcctgttatt tcaagagccg gcatcggccg 360

tggcctcgtt tctcaatggc ctggccagcc tggtgatgct ctgccgotac cgcaccttcg 420 

tgccagcctc ctcccccatg taccacacct gtgtggcctt cgcctggatg agaaaactga 480 

ggcacagcaa ggctaaataa cttgcccaag gacacacagg aaatgcagag ccaggaactg 540 

aaccctggca gtctggctgt agggcttgca ttcttaatga taccactacc tcccaaatct 600 

gaggaaaggg tgtccctcaa tgcatggttc tggtccacag tcttccacac cagggacact 660
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gacctcacag agaaaatgga ctacttctgt gcctccactg tcatcctaca ctcaatctac 720 

ctgtgctgcg tcaggtgagc ctgcctgggt ggctgcaggg gcaaaatcga accctggggg 780 

cagaaagggg tcacccagcc ttcccctggg ggccttcttc actagtctcc caacacctac 840 

gccccccaac ccccaacaca tcagctgtcc tgggtgagga ctctggggta ggactggggg 900

ccctggctcc tgacaaggag ctgtagcact tgctgcccag ctgtggcctg tttggtgggg 960

agaggggtag tgacttcagg ggccatgcac caatgttggg gggaggagat gcttcaggga 1020 

atgctgctct ggggatgggo cacctgccct ctgagcaacc ctggacggtg gggcaggaco 1080 

gtggggotgc agcacccagc tgtggtcagt gccttccggg ctctcctgct gctcatgctg 1140 

accgtgcacg tctcctacct gagcctcatc cgcttcgact atggctacaa cctggtggcc 1200 

aacgtggcta ttggcctggt caacgtggtg tggtggctgg cctggtgcct gtggaaccag 1260

cggcggctgc ctcacgtgcg caagtgcgtg gtggtggtct tgctgctgca ggggctgtcc 1320

ctgctcgagc tgcttgaott cccaccgotc ttctgggtco tggatgccca tgccatctgg 1380 

cacatoagca ccatccctgt ccacgtcatc tttttcagct ttctggaaga tgaoagcctg 1440 

tacctgctga aggaatcaga ggacaagttc aagctggact gaagacottg gagcgagtct 1500 

gacccagtgg ggatcctgcc cccgccctgc tggcctccct tctcccctca acccttgaga 1560

tgattttctc ttttcaactt cttgaacttg gacatgaagg atgtgggccc agaatcatgt 1620

ggccagccca ccccctgttg gccctcacca gccttggagt ctgttctagg gaaggcctcc 1680

cagcatctgg gactcgagag tgggcagcac ctctacctcc tggagctgaa ctggggtgga 1740 

actgagtgtg ttcttagctc taccgggagg acagctgcct gtttcotccc caccagcctc 1800 

ctccccacat ccccagctgc ctggctgggt cctgaagcco tctgtctacc tgggagacca 1860 

gggaccacag gccttaggga tacagggggt ccccttctgt taccaccccc caccctcctc 1920

caggacacca ctaggtggtg ctggatgctt gttctttggc cagccaaggt tcacggcgat 1980 

totccccatg ggatottgag ggaccaagct gctgggattg ggaaggagtt tcaccctgac 2040 

cgttgcocta gccaggttcc caggaggcot caccatactc cctttcaggg ccagggotcc 2100 

agcaagccca gggcaaggat cctgtgctgo tgtctggttg agagcctgcc accgtgtgtc 2160 

gggagtgtgg gccaggctga gtgcataggt gacagggccg tgagcatggg cctgggtgtg 2220

tgtgagctca ggcctaggtg cgcagtgtgg agacgggtgt tgtcggggaa gaggtgtggc 2280 

ttcaaagtgt gtgtgtgcag ggggtgggtg tgttagcgtg ggttagggga acgtgtgtgc 2340 

gcgtgctggt gggcatgtga gatgagtgac tgccggtgaa tgtgtccaca gttgagaggt 2400 

tggagcagga tgagggaatc ctgtcaccat caataatcac ttgtggagcg ccagctctgc 2460 

ccaagacgcc acctgggcgg acagccagga gctctccatg gccaggctgc ctgtgtgcat 2520

gttccctgtc tggtgcccct ttgcocgcct cctgcaaacc tcacagggtc cccacacaac 2580 

agtgccctcc agaagoagco cctcggaggc agaggaagga aaatggggat ggctggggot 2640 

ctctccatcc tccttttctc cttgccttcg catggctggc cttcacctcc aaaacctcca 2700 

ttcccctgct gccagcccct ttgccatagc ctgattttgg ggaggaggaa ggggcgattt 2760 

gagggagaag gggagaaagc ttatggctgg gtctggtttc ttcccttccc agagggtctt 2820 

actgttccag ggtggcccca gggcaggcag gggccacact atgcctgcgc cctggtaaag 2880

gtgacccctg ccatttacca gcagccctgg catgttcctg ccccacagga atagaatgga 2940 

gggagctcca gaaaotttcc atccoaaagg cagtctccgt ggttgaagca gactggattt 3000 

ttgctctgcc cctgacccct tgtccctctt tgagggaggg gagctatgct aggactccaa 3060 

cctcagggac tcgggtggcc tgcgctagct tcttttgata ctgaaaactt ttaaggtggg 3120 

agggtggcaa gggatgtgct taataaatca attccaagcc tcacctg 3167

<210> 63 
<211> 2733 
<212> DNA

<213> Homo sapiens

EP 1 365 034 A2

<220>

<221> misc_feature

<222> (2694)..(2694)

<223> n=a, c, g or t

<220>

<221> misc_feature

<222> (2724)..(2724)

<223> n=a, c, g or t

<400> 63

agggagaaag gatggccggo ctggoggcgc ggttggtcat gctagctggg goagcggcgo 60 

tggcgagcgg ctcccagggc gaccgtgagc cggtgtaccg cgactgcgta ctgcagtgcg 120

aagagcagaa otgctotggg ggcgctctga atcaattccg ctccogccag ccaatctaca 180 

tgagtctagc aggctggacc tgtcgggacg actgtaagta tgagtgtatg tgggtcaccg 240

ttgggctcta cctccaggaa ggtcacaaag tgcctcagtt ccatggcaag tggcccttct 300

cccggttcct gttctttcaa gagccggcat cggccgtggc ctcgtttctc aatggcctgg 360

ccagcctggt gatgctctgc cgctaccgca ccttcgtgcc agcctcctcc cccatgtacc 420

acaoctgtgt ggccttcgcc tgggtgtcco tcaatgcatg gttctggtcc acagtcttcc 480 

acaccaggga cactgaccta cagagaaaat ggactacttc tgtgcctcct gtatcataca 540

atcaatctac ctgtgctgcg tcaggaccgt ggggctgcag cacccagctg tggtcaagtg 600 

ccttccgggc tctcctgctg ctcatgctga ccgtgcacgt ctootacctg agcctcatcc 660 

gcttcgacta tggctacaac ctggtggcca acgtggctat tggcctggto aacgtggtgt 720 

ggtggctggc ctggtgcctg tggaaccagc ggcggatgcc tcacgtgcgc aagtgcgtgg 780 

tggtggtctt gctgctgcag gggotgtccc tgctcgagct gcttgacttc ccaccgctct 840 

tctgggtcct ggatgcccat gccatctgga aoatoagcac catccctgtc cacgtcctct 900 

ttttoagctt tctggaagat gacagcctgt acctgctgaa ggaatcagag gacaagttca 960 

agctggactg agaccttgga gcgaagtotg occcagtggg gatcotgcco ccgccctgct 1020 

ggcctccctt ctcccctcaa cccttgagat gattttctct tttcaacttc ttgaacttgg 1080

acatgaagga tgtgggccca gaatcatgtg gccagcccac cccctgttgg ccctcacaag 1140 

ccttggagtc tgttotaggg aaggcctccc agcatctggg actcgagagt gggcagcccc 1200 

tctacotoct ggactgaact ggggtggaac tgagtgtgtt cttagctcta ccgggaggac 1260 

agctgcctgt ttcctcocca ccagcctcct ccccacatco ccagotgcct ggctgggtcc 1320 

tgaagccctc tgtctacctg ggagaccagg gtaccacagg ccttagggat acagggggtc 1380

cccttctgtt accacccccc accctcctcc aggacaccac taggtggtgc tggatgcttg 1440 

ttctttggcc agccaaggtt cacggcgatt otccccatgg gatcttgagg gaccaagctg 1500 

ctgggattgg gaaggagttt caccctgacc gttgccotag ccaggttocc aggaggcctc 1560 

accatactcc ctttcagggc cagggctcca gcaagcccag ggcaaggatc ctgtgctgct 1620 

gtctggttga gagcctgcca ocgtgtgtcg ggagtgtggg ccaggctgag tgcataggtg 1680 

acagggccgt gagcatgggc ctgggtgtgt gtgagctcag gcctaggtgc gcagtgtgga 1740

gacgggtgtt gtcggggaag aggtgtggct tcaaagtgtg tgtgtgcagg gggtgggtgt 1800 

gttagcgtgg gttaggggaa cgtgtgtgcg cgtgctggtg ggcatgtgag atgagtgact 1860 

gccggtgaat gtgtccacag ttgagaggtt ggagcaggat gagggaatcc tgtcaccatc 1920 

aataatcact tgtggagcgc cagctctgcc caagaogoca cctgggcgga cagccaggag 1980 

ctctccatgg ccaggctgcc tgtgtgcatg ttccctgtct ggtgcccctt tgcccgcctc 2040

ctgcaaacot cacagggtcc ccacacaaca gtgccotcca gaagcagccc ctcggaggca 2100 

gaggaaggaa aatggggatg gctggggctc tctccatcot oottttctcc ttgccttcgc 2160 

atggctggcc ttcccctcca aaacctccat tcccctgctg ccagcccctt tgocatagcc 2220 

tgattttggg gaggaggaag gggcgatttg agggagaagg ggagaaagct tatggctggg 2280 

tctggtttct tcccttccca gagggtctta ctgttccagg gtggccccag gcagcagggc 2340 

cacactatgc ctgcgccctg gtaaaggtga cccctgccat ttaccagcag ccctggcatg 2400
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ttcctgcccc acaggaatag aatggaggga gctccagaaa ctttccatcc caaaggcagt 2460 

ctccgtggtt gaagcagact ggatttttgc tctgcccctg accccttgtc cctctttgag 2520

ggaggggagc tatgctagga ctccaacctc agggactcgg gtggcctgcg ctagcttctt 2580

ttgatactga aaacttttaa ggtgggaggg tggcaaggga tgtgcttaag cggccgcgaa 2640

ttcaaaaagc ttctcgagag tacttctaga gcggccgcgg gcccatcgat tttnccaccc 2700

gggtggggta cccaggtaag tgtnccccat atc 2733

<210> 64 

<211> 2546

<212> DNA

<213> Homo sapiens

<400> 64

aagctcctcc cccggcggcg agccagggag aaaggatggc cggcctggcg gcgcggttgg 60

tcctgctagc tggggcagcg gcgctggcga gcggctccca gggcgaccgt gagccggtgt 120

accgcgactg cgtactgcag tgcgaagagc agaactgctc tgggggcgct ctgaatcact 180

tccgctcccg ccagccaatc tacatgagtc tagcaggctg gacctgtcgg gacgactgta 240

agtatgagtg tatgtgggtc accgttgggc tctacctcca ggaaggtcac aaagtgcctc 300

agttccatgg caagtggccc ttctcccggt tcctgttctt tcaagagccg gcatcggccg 360

tggcctcgtt tctcaatggc ctggccagcc tggtgatgct ctgccgctac cgcaccttcg 420

tgccagcctc ctcccccatg taccacacct gtgtggcctt cgcctgggtg tccctcaatg 480

catggttctg gtccacagtc ttccacacca gggacactga cctcacagag aaaatggact 540

acttatgtgc ctccactgtc atcctacact caatctacct gtgctgcgtc aggcctggtc 600

aacgtggtgt ggtggctggc ctggtgcctg tggaaccagc ggcggctgcc tcacgtgcgc 660

aagtgcgtgg tggtggtctt gctgctgcag gggctgtccc tgctcgagct gcttgacttc 720

ccaccgctct tctgggtcct ggatgcccat gccatctggc acatcagcac catccctgtc 780

cacgtcctct ttttcagctt tctggaagat gacagcctgt acctgctgaa ggaatcagag 840

gacaagttca agctggactg aagaccttgg agcgagtctg ccccagtggg gatcctgccc 900

ccgccctgct ggcctccctt ctcccctcaa cccttgagat gattttctct tttcaacttc 960

ttgaacttgg acatgaagga tgtgggccca gaatcatgtg gccagcccac cccctgttgg 1020

ccctcaccag ccttggagtc tgttctaggg aaggcctccc agcatctggg actcgagagt 1080

gggcagcccc tctacctcct ggagctgaac tggggtggaa ctgagtgtgt tcttagctct 1140

accgggagga cagctgcctg tttcctaccc accagcctca tccccacatc cccagctgcc 1200

tggctgggtc ctgaagccct ctgtctacct gggagaccag ggaccacagg ccttagggat 1260

acagggggtc cccttctgtt accacccccc accctcctcc aggacaccac taggtggtgc 1320

tggatgcttg ttctttggcc agccaaggtt cacggcgatt ctccccatgg gatcttgagg 1380

gaccaagctg ctgggattgg gaaggagttt caccctgacc gttgccctag ccaggttccc 1440

aggaggcctc accatactcc ctttcagggc cagggctcca gcaagcccag ggcaaggatc 1500

ctgtgctgct gtctggttga gagcctgcca ccgtgtgtcg ggagtgtggg ccaggctgag 1560

tgcataggtg acagggccgt gagcatgggc ctgggtgtgt gtgagctcag gcctaggtgc 1620

gcagtgtgga gacgggtgtt gtcggggaag aggtgtggct tcaaagtgtg tgtgtgcagg 1680

gggtgggtgt gttagcgtgg gttaggggaa cgtgtgtgcg cgtgctggtg ggcatgtgag 1740

atgagtgact gccggtgaat gtgtccacag ttgagaggtt ggagcaggat gagggaatcc 1800

tgtcaccatc aataatcact tgtggagcgc cagctctgcc caagacgcca cctgggcgga 1860

cagccaggag ctctccatgg ccaggctgcc tgtgtgcatg ttccctgtct ggtgcccctt 1920

tgcccgcctc ctgcaaacct cacagggtcc ccacacaaca gtgccctcca gaagcagccc 1980

ctcggaggca gaggaaggaa aatggggatg gctggggctc tctccatcct ccttttctcc 2040

ttgccttcgc atggctggcc ttcccctcca aaacctccat tcccctgctg ccagcccctt 2100

tgccatagcc tgattttggg gaggaggaag gggcgatttg agggagaagg ggagaaagct 2160

tatggctggg tctggtttct tcccttccca gagggtctta ctgttccagg gtggccccag 2220

ggcaggcagg ggccacacta tgcctgcgcc ctggtaaagg tgacccctgc catttaccag 2280

cagccctggc atgttcctgc cccacaggaa tagaatggag ggagctccag aaactttcca 2340

tcccaaaggc agtctccgtg gttgaagcag actggatttt tgctctgccc ctgacccctt 2400

gtccctcttt gagggagggg agctatgcta ggactccaac ctcagggact cgggtggcct 2460
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gcgctagctt cttttgatac tgaaaacttt taaggtggga gggtggcaag ggatgtgctt 2520 
aataaatcaa ttccaagcct cacctg 2546 

<210> 65 

<211> 2683 

<212> DNA

<213> Homo sapiens 



<400> 65 

aagctcctcc cccggcggcg agccagggag aaaggatggc cggcctggcg gcgcggttgg 60 

tcctgctagc tggggcagcg gcgctggcga gcggctcoca gggcgaccgt gagccggtgt 120 

accgcgactg cgtactgcag tgcgaagagc agaactgctc tgggggogot ctgaatcact 180 

tccgctcccg ccagccaatc tacatgagtc tagcaggctg gacctgtcgg gacgactgta 240 

agtatgagtg tatgtgggtc accgttgggc tctacctcca ggaaggtcac aaagtgcctc 300 

agttccatgg caagtggccc ttctcccggt tcctgttctt toaagagccg gcatcggacg 360 

tggcctcgtt tctcaatggc ctggccagcc tggtgatgct ctgacgctac cgcaccttcg 420

tgccagcotc ctcccccatg taccacacct gtgtggcctt cgcctgggtg tccctcaatg 480 

catggttctg gtccacagtc ttccaoacca gggacactga cctcacagag aaaatggact 540 

acttctgtgc ctccactgtc atcctacact caatctacct gtgctgcgtc aggaacgtgg 600

ggctgcagca cccagctgtg gtcagtgcct tccgggctot cctgctgctc atgctgaccg 660 

tgcacgtctc ctacctgagc ctcatccgot tcgactatgg ctacaacctg gtggccaacg 720 

tggctattgg cctggtcaac gtggtgtggt ggctggcotg gtgcctgtgg aaccagcggc 780 

ggctgcctca cgtgcgcaag tgcgtggtgg tggtcttgct gctgcagggg otgtccctgc 840 

tcgagctgct tgacttccca ccgctcttct gggtcctgga tgcccatgcc atctggcaca 900 

tcagcaccat ccctgtccac gtcctctttt tcagctttct ggaagatgac agcctgtacc 960 

tgctgaagga atcagaggac aagttcaagc tggactgaag accttggagc gagtctgcoc 1020 

cagtggggat cctgcccccg coctgctggc ctcccttctc ocotcaaccc ttgagatgat 1080 

tttotctttt caacttottg aacttggaca tgaaggatgt gggcccagaa tcatgtggcc 1140 

agcccacccc ctgttggccc tcaccagcct tggagtctgt tctagggaag gcctcccagc 1200 

atotgggact cgagagtggg cagcccctct acctcctgga gctgaactgg ggtggaactg 1260 

agtgtgttct tagatotacc gggaggacag otgcotgttt cctcaccacc agcctcctoo 1320 

ccacatcccc agctgcctgg ctgggtcctg aagccctctg tctacctggg agaccaggga 1380

ccacaggcct tagggataca gggggtccco ttctgttacc accccccacc ctcctccagg 1440 

acaccactag gtggtgctgg atgottgtto tttggccagc caaggttcac ggcgattctc 1500 

cccatgggat cttgagggac caagctgctg ggattgggaa ggagtttcac cctgaccgtt 1560 

gccctagcca ggttcccagg aggcctcacc ataotccctt tcagggccag ggctccagca 1620 

agcccagggc aaggatcctg tgctgctgtc tggttgagag cctgccaccg tgtgtcggga 1680 

gtgtgggcca ggctgagtgc ataggtgaca gggccgtgag catgggcctg ggtgtgtgtg 1740 

agctcaggcc taggtgogca gtgtggagac gggtgttgtc ggggaagagg tgtggcttca 1800 

aagtgtgtgt gtgcaggggg tgggtgtgtt agcgtgggtt aggggaacgt gtgtgcgcgt 1860 

gctggtgggc atgtgagatg agtgactgcc ggtgaatgtg tccacagttg agaggttgga 1920 

gcaggatgag ggaatcctgt caccatcaat aatcacttgt ggagcgccag ctctgcccaa 1980

gacgccacct gggcggacag ccaggagctc tccatggcca ggctgcctgt gtgcatgttc 2040 

cctgtctggt gcccctttgc ccgcctcctg caaacctcac agggtcccca cacaacagtg 2100

ccctccagaa gcagcccctc ggaggcagag gaaggaaaat ggggatggct ggggctctct 2160

ccatcctcct tttctccttg ccttcgcatg gctggccttc ccctccaaaa cctccattcc 2220 

cctgctgcca gcccctttgc catagcctga ttttggggag gaggaagggg cgatttgagg 2280

gagaagggga gaaagcttat ggctgggtct ggtttcttco cttcccagag ggtcttactg 2340 

ttccagggtg gccccagggc aggcaggggc cacactatgc ctgcgccctg gtaaaggtga 2400 

cccctgccat ttaccagcag ccctggcatg ttcctgcccc acaggaatag aatggaggga 2460 

gctccagaaa ctttccatcc caaaggcagt ctccgtggtt gaagcagact ggatttttgc 2520 

tctgcccctg accccttgtc cctctttgag ggaggggagc tatgctagga ctccaacctc 2580

agggactcgg gtggcctgcg ctagcttctt ttgatactga aaacttttaa ggtgggaggg 2640

tggcaaggga tgtgcttaat aaatcaattc caagcctcac ctg 2683
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<400> 66 

aagctcctcc cccggcggcg agccagggag aaaggatggc cggcctggcg gcgcggttgg 60 

tcctgctagc tggggcagcg gcgctggcga gcggctocca gggcgaccgt gagccggtgt 120 

accgcgactg cgtactgcag tgcgaagagc agaactgctc tgggggcgct ctgaatcact 180

tccgctcccg ccagccaatc tacatgagtc tagcaggctg gacctgtcgg gacgactgta 240

agtatgagtg tatgtgggtc accgttgggc tctacctcca ggaaggtcac aaagtgcatc 300

agttccatgg caagtggccc ttctcccggt tcctgttctt tcaagagccg gcatcggccg 360

tggcctcgtt tctcaatgga ctggccagcc tggtgatgct ctgccgctac cgcaccttcg 420 

tgccagcctc ctcccccatg taccacacct gtgtggcctt cgcctgggtg tccctcaatg 480 

catggttctg gtccacagtc ttccacacca gggacactga cctcacagag aaaatggact 540

acttctgtgc ctccactgtc atcctacact caatctacct gtgctgogtc agctttctgg 600 

aagatgacag cctgtacctg ctgaaggaat cagaggacaa gttcaagctg gactgaagac 660 

cttggagcga gtctgcccca gtggggatcc tgcccccgcc ctgctggcct cccttctccc 720 

ctcaaccctt gagatgattt tctcttttca acttcttgaa cttggacatg aaggatgtgg 780

gcccagaatc atgtggccag cccaccccct gttggccctc accagccttg gagtctgttc 840

tagggaaggc ctcccagcat ctgggactcg agagtgggca gcccctctac ctcctggagc 900

tgaactgggg tggaactgag tgtgttctta gctctaccgg gaggacagct gcctgtttcc 960

tccccaccag cctcctcccc acatccccag ctgcctggct gggtcctgaa gccctctgtc 1020 

tacctgggag accagggacc acaggcctta gggatacagg gggtcccctt ctgttaccac 1080

cccccaccct cctccaggac accactaggt ggtgctggat gcttgttctt tggccagcca 1140

aggttcacgg cgattctccc catgggatct tgagggacca agctgctggg attgggaagg 1200

agtttcaccc tgaccgttgc cctagccagg ttcccaggag gcctcaccat actcactttc 1260

agggccaggg ctccagcaag cccagggcaa ggatcctgtg ctgctgtctg gttgagagcc 1320

tgccaccgtg tgtcgggagt gtgggccagg ctgagtgcat aggtgacagg gccgtgagca 1380

tgggcctggg tgtgtgtgag ctcaggccta ggtgcgcagt gtggagacgg gtgttgtcgg 1440

ggaagaggtg tggcttcaaa gtgtgtgtgt gcagggggtg ggtgtgttag cgtgggttag 1500

gggaacgtgt gtgcgcgtgc tggtgggcat gtgagatgag tgactgccgg tgaatgtgtc 1560

cacagttgag aggttggagc aggatgaggg aatcctgtca ccatcaataa tcacttgtgg 1620

agcgccagct ctgcccaaga cgccacatgg gcggacagcc aggagctctc catggccagg 1680

ctgcctgtgt gcatgttccc tgtctggtgc ccctttgccc gcctcctgca aacctcacag 1740

ggtccccaca caacagtgcc ctccagaagc agcccctcgg aggcagagga aggaaaatgg 1800

ggatggctgg ggctctctcc atcctccttt tctccttgcc ttcgcatggc tggccttccc 1860

ctccaaaacc tccattcccc tgctgccagc ccctttgcca tagcctgatt ttggggagga 1920

ggaaggggcg atttgaggga gaaggggaga aagcttatgg ctgggtctgg tttcttccct 1980

tcccagaggg tcttactgtt ccagggtggc cccagggcag gcaggggcca cactatgcct 2040

gcgccctggt aaaggtgacc cctgccattt accagcagcc ctggcatgtt cctgccccac 2100

aggaatagaa tggagggagc tccagaaact ttccatccca aaggcagtct ccgtggttga 2160

agcagactgg atttttgctc tgcccctgac cccttgtccc tctttgaggg aggggagcta 2220

tgctaggact ccaacctcag ggactcgggt ggcctgcgct agcttctttt gatactgaaa 2280

acttttaagg tgggagggtg gcaagggatg tgcttaataa atcaattcca agcctcacct 2340

g 2341
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<210> 67 

<211> 2109 

<212> DNA 

<213> Homo sapiens 



<400> 67 

gattcggccg gagcfcgccag cggggaggct gcagccgcgg gttgttacag ctgctggagc 60 

agcagcggco cccgctcccg ggaaccgttc ccgggccgtt gatcttcggc cccacacgaa 120 

oagcagagag gggcagcagg atgaatgtgg gcacagcgca cagcgaggtg aaccccaaca 180 

cgcgggtgat gaaoagccgt ggoatctggc tctcctacgt gctggccatc ggtctoctcc 240 

acatcgtgct gctgagcatc ccgtttgtga gtgtcoctgt cgtctggaco otcaccaacc 300 

tcattcacaa catgggcatg tatatcttcc tgcacacggt gaaggggaca occtttgaga 360 

ccccggacca gggcaaggcg aggctgctaa ocoaotggga gcagatggat tatggggtco 420 

agttcacggc ctctcggaag ttcttgacca tcacaccoat cgtgctgtac ttcctcacca 480 

gcttotaoao taagtacgac cagatccatt ttgtgotcaa caccgtgtcc ctgatgagcg 540 

tgcttatccc caagctgccc cagctccacg gagtccggat ttttggaatc aataagtact 600 

gagagtgcag ccccttoccc tgcccagggt ggcaggggag gggtagggta aaaggcatgt 660 

gctgcaacac tgaagacaga aagaagaagc atctggacac tgccagagat gggggttgag 720 

cctctggcct aatttccccc ctcgcttccc ccagtagcca aottggagta gcttgtagtg 780 

gggttggggt aggccccctg ggctotgace ttttctgaat tttttgatct cttcottttg 840 

ctttttgaat agagaotcoa tggagttggt catggaatgg gctgggctcc tgggctgaac 900 

atggaocacg cagttgcgac aggaggcoag gggaaaaacc cctgctcact tgtttgccct 960 

caggcagcca aagcacttta acccctgcat agggagcaga gggcggtacg gcttctggat 1020 

tgtttcactg tgattcctag gttttttoga tgccatgcag tgtgtgcttt tgtgtatgga 1080 

agcaagtgtg ggatgggtot ttgcctttct gggtagggag ctgtotaatc caagtaccag 1140 

gcttttggca gcttctctgc aacccaccgt gggtcctggt tgggagtggg gagggtcagg 1200 

ttggggaaag atggggtaga gtgtagatgg cttggttcca gaggtgaggg ggccagggct 1260 

gotgccatcc tggcctggtg gaggttgggg agotgtagga gage tag tga gtcgagactt 1320 

agaagaatgg ggecacatag cagcagagga ctggtgtaag ggagggaggg gtagggacag 1380 

aagctagacc caatotoott tgggatgtgg gcagggaggg aagcaggctt ggagggttaa 1440 

tttacccaca gaatgtgata gtaatagggg agggaggctg ctgtgggttt aactcctggg 1500 

ttggctgttg ggtagacagg tggggaaaag gcccgtgagt cattgtaagc acaggtccaa 1560 

cttggocctg actcctgcgg gggtatgggg aagotgtgac agaaacgatg ggtgctgtgg 1620 

tcctctgcag gccotcaccc cttaacttcc teatgeagae tggcactggg cagggcctct 1680 

catgtggcag coacatgtgg cgttgtgagg ccaccccatg tggggtctgt ggtgagagto 1740 

ctgtaggatc cctgctcaag cagcacagag gaaggggcaa gacgtggcct gtaggcactg 1800 

tctcagcctg cagagaagaa agtgaggccg ggagcotgag cctgggctgg agccttctcc 1860 

cctccccagt tggactaggg gcagrtgttaa ttttgaaaag gtgtgggtcc ctgtgtcctt 1920 

ttccaggggt ccaagggaac aggagaggtc actgggcctg ttttctccct cctgaccctg 1980 
catctcccac cctgtgtatc atagggaact ttcaccttaa aatctttcta agcaaagtgt 2040 
gaataggatt tttactccct ttgtacagta ttctgaggaa cgcaaataaa agggcaacat 2100 
gtttctgtt 2109 


<210> 68 
<211> 2423 
<212> DNA 
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<400> 68 

gagagccgag ctagcgacga gcagtcgttg cggccgccgg cgccgcggga ggtggtggag 60 

gcctagccgg agccgagagg tctcttgttc ccgtcccacg gtcccggcgt cacccctccg 120 

gcgcccagtc cccgtcccgg aactcccggg cctgtcctgg gcccccggtc tgtgcactcc 180 

gctcgccgca gcgcccggcc cgggccgcac ccgccggccc catgaggagg gacgtgaacg 240 

gagtgaccaa gagcaggttt gagatgttct caaatagtga tgaagctgta atcaataaaa 300 

aacttcccaa agaactcctg ttacggatat tttcttttct agatgttgtt accctgtgcc 360 

gctgtgctca ggtctccagg gcctggaatg ttctggctct ggatggcagt aactggcagc 420 

gaattgacct atttgatttc cagagggata ttgagggccg agtagtggag aatatttcaa 480 

aacgatgtgg gggcttttta cgaaagttaa gtcttcgtgg atgtcttgga gtgggagaca 540 

atgcattaag aacctttgca caaaactgca ggaacattga agtactgaat ctaaatgggt 600 

gtacaaagac aacagacgct acatgtacta gcottagcaa gttctgttcc aaactcaggc 660 

accttgactt ggcttcctgt acatcaataa caaacatgtc tctaaaagct ctgagtgagg 720 

gatgtccact gttggagcag ttgaaoattt ootggtgtga ccaagtaacc aaggatggca 780 

ttcaagcact agtgaggggc tgtgggggto tcaaggcctt attcttaaaa ggctgcaogc 840 

agctagaaga tgaagctctc aagtacatag gtgcaoactg ccctgaactg gtgaatttga 900 

aottgcagac ttgcttgcaa atcacagatg aaggtctcat tactatatgo agagggtgcc 960 

ataagttaca atccctttgt gcctctggct gctccaacat cacagatgoc atcotgaatg 1020 

ctctaggtca gaactgccca cggcttagaa tattggaagt ggcaagatgt tctcaattaa 1080 

cagatgtggg ctttaccfact otagccagga attgccatga acttgaaaag atggacctgg 1140 

aagagtgtgt tcagataaca gatagcaoat taatccaact ttctatacac tgtcctcgac 1200 

ttcaagtatt gagtctgtot cactgtgagc tgatoacaga tgatggaatt cgtcaoctgg 1260 

ggaatggggo ctgcgcccat gaccagctgg aggtgattga gotggacaao tgcccactaa 1320 

tcacagatgc atccctggag cacttgaaga gctgtcatag ccttgagogg atagaaotct 1380 

atgaotgcca gcaaatcaca cgggctggaa tcaagagact caggacccat ttacccaata 1440 

ttaaagtcca cgcctacttc goacctgtca otccaccccc atcagtaggg ggcagcagac 1500 

agcgcttctg cagatgctgc a tea tec tat gaoaatggag gtggtcaacc ttggcgaact 1560 

gagtatttaa tgacacttct agagctaccg tggagtctct ocagtggaag caaccccagt 1620 

gttctgagca agggttacaa agtgagggag ggcagtgtcc agatccocag agccacacat 1680 

acatacacat acacaccctt acccccatcc actctagctt tgtgaccatg ggactgaagt 1740 

ttgtgatggc ttttttatoa agtagattgg taaaatttaa ccattcctgt tgaggtgccc 1800 

ataagaaaat cataggecaa gatagggagg ggcattccag caaaocccgt gttaatgeta 1860 

ctgtggtttt taaatttttg tctaggggtt tctttgggga ttttagaaca gcatotgatg 1920 

tcctccgggg tcaagaaaag catggaaaga caatatatga tgtacccagg gaccagaaag 1980 

aaaatttott tgcatcttag aaatggtaga cattcattgt gactaaagag ettctatget 2040 

tccttgtttc catgccaaca tgctgagcat gctoacaaag aaggctcgtc cattcctcct 2100 

gtgttttagt atttggecca gaggtttcct aaatggttgc cttgaaatca ctgtggtcca 2160 

aatgtaattc ttacacactc aaattatcac tgtctgtagc acacttgtgc acctgtctta 2220 

cattctctgt tgctcccccc cacactcttg ctcagtctgt cacctgttca gtctgettae 2280 

tcactcaatt gttacccttt tgctgttgtc gtgtttacag tttgcatttt gaatgattag 2340 

ttgggattac caaacatttt ttaaaaagat attatcaata aatatttttt taattctaaa 2400 

ttttaaaaaa aaaaaaaaaa aaa 2423 
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agotgggacc ggagggtgag cccggcagag gcagagacac aegeggagag gaggagaggc 60 

tgagggaggg aggtggagaa ggaegggaga ggcagagaga ggagacacgo agagacactc 120 

aggaggggag agacaccgag aegcagagae actcaggagg ggagagacac egagaegcag 180 

agacacccag geeggggage gegagggage gaggcacaga cctggctcag egagegeggg 240 

gggcgagccc cgagtcccga gagcctgggg gcgcgcccag cccgggcgcc gaccctcctc 300 
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ccgctcccgc gccctcccct cggcgggcac ggtattttta tccgtgcgcg aacagccctc 360 
ctcctcctct cgccgcacag cccgccgcct gcgcggggga gcccagcaca gaccgccgcc 420 
gggaccccga gtcgcgcacc ccagccccac cgoccacccc gcgcgccatg gaccccaagg 480 

accgcaagaa gatccagttc tcggtgcccg cgccccctag ccagctcgac ccccgccagg 540 
tggagatgat ccggcgcagg agaccaacgc ctgccatgct gttccggctc tcagagcact 600 

cctcaccaga ggaggaagco tccccccacc agagagcctc aggagagggg caeca to tea 660 

agtcgaagag acccaacccc tgtgcctaca caccaccttc gctgaaagct gtgeagegea 720 

ttgctgagtc tcacctgcag tctatcagca atttgaatga gaaccaggcc tcagaggagg 780 

aggatgagct gggggagctt egggagotgg gttatccaag agaggaagat gaggaggaag 840 

aggaggatga tgaagaagag gaagaagaag aggacageca ggctgaagtc ctgaaggtca 900 

tcaggcagtc tgctgggcaa aagacaacct gtggccaggg tctggaaggg ccctgggagc 960 

gcccaqcccc tctggatgag tccgagagag atggaggctc tgaggaccaa gtggaagacc 1020 

cagcactaag tgagcotggg gaggaacctc agcgcccttc cccctotgag cctggcacat 1080 

aggcacccag cctgcatctc ccaggaggaa gtggagggga catcgctgtt ccccagaaac 1140 

ccaotctatc ctcaccctgt tttgtgctat tcccotcgcc tgetaggget gcggcttctg 1200 

acttctagaa gactaaggct ggtctgtgtt tgcttgtttg cccacctttg gctgataccc 1260 

agagaacctg ggcacttget gcctgatgcc cacccctgcc ag teat tec t ccattcaccc 1320 

agegggaggt gggatgtgag acagcccaca ttggaaaatc cagaaaaocg ggaacaggga 1380 

tttgcccttc acaattctac tcoccagatc ctctcccctg gacacaggag acccaoaggg 1440 

caggacccta agatctgggg aaaggaggtc ctgagaacct tgaggtaccc ttagatcctt 1500 

ttctacccac tttcctatgg aggattccaa gtcaccactt ctctcaccgg cttctaccag 1560 

ggtccaggac taaggcgttt ttctccatag cctcaacatt ttgggaatot tcccttaatc 1620 

acccttgctc ctcctgggtg cctggaagat ggactggcag agacctcttt gttgcgtttt 1680 

gtgetttgat gecaggaatg ccgcctagtt tatgtccccg gtggggcaca cagegggggg 1740 

egccaggttt tccttgtccc ocagctgctc tgcccctttc cccttcttcc ctgaotccag 1800 

gcctgaaccc ctcccgtgct gtaataaatc tttgtaaata a 1841 

<210> 70 
<211> 748 
<212> DNA 

<213> Homo sapiens 


<400> 70 

ggccgcgatg ageggggago eggggcagae gtccgtagcg ccocctcccg aggaggtcga 60 

geegggcagt ggggtccgea tcgtggtgga gtactgtgaa ccctgcggct tegaggegae 120 

ctacctggag ctggccagtg ctgtgaagga gcagtatccg ggcatcgaga tegagtcgeg 180 

cctcgggggc acaggtgect ttgagataga gataaatgga cagctggtgt tctccaagct 240 

ggagaatggg ggatttccct atgagaaaga tctcattgag gccatccgaa gagecagtaa 300 

tggagaaacc ctagaaaaga tcaccaacag ccgtcctacc tgcgtcatcc tgtgactgca 360 

caggactctg ggttcctgct ctgttctggg gtccaaacct tggtctccct ttggtcctgc 420 

tgggagotcc ccctgcctct ttcccctact tagctcctta gcaaagagac cctggcctcc 480 

actttgccct ttgggtacaa agaaggaata gaagattccg tggccttggg ggcaggagag 540 

agacactctc catgaacact tctccagcca cctcataccc ccttcccagg gtaagtgccc 600 

acgaaagccc agtccactct tegcoteggt aatacctgtc tgatgecaca gattttattt 660 

attctcccct aacccagggc aatgtcagct attggcagta aagtggcgct acaaacacta 720 

748 



<210> 71 

<211> 795 

<212> DNA 

<213> Homo sapiens 

<400> 71 

tacggc tgcg agaagacgac agaagctaga cccaatctcc tttgggatgt gggcagggag 60 

ggaagcaggc ttggagggtt aatttaccca cagaatgtga tagtaatagg ggagggaggc 120 

tgctgcgggt ttaactcctg ggttggctgt tgggtagaca ggtggggaaa aggcccgtga 180 

gtcattgtaa gcacaggtcc aacttggccc tgactcotgc gggggtatgg ggaagctgtg 240 

acagaaacga tgggtgotgt ggtcotctgc aggccotcac cccttaactt cctcatacag 300 

aotggcactg ggcagggcct ctcatgtggo agccacatgt ggcgttgtga ggccacocca 360 

tgtggggtct gtggtgagag tcctgtagga tccctgctca agcagcacag aggaaggggc 420 

aagaogtggo ctgtaggcac tgtctoagcc tgcagagaag aaagtgaggc cgggagcctg 480 

agcctgggct ggagccttct cccctcocca gttggactag gggcagtgtt aattttgaaa 540 

aggtgtgggt ccctgtgtcc tcttccaggg gtccaaggga acaggagagg tcactgggcc 600 

tgttttctcc ctcctgaccc tgcatctccc accccgtgta tcatagggaa ctttcacctt 660 

aaaatcttta taagcaaagt gtgaatagga tttttactcc otttgtacag tattotgaga 720 

aacgcaaata aaagggcaac atgtttctgt taaaaaaaaa aaaaagtacg caaaaaaaaa 780 
aaaaaaaaaa 795 

<210> 72 

<211> 2356 

<212> DNA 

<213> Homo sapiens 



<400> 72 

ggcacgaggc cggaagtgac ctctagagog gtggtgaaac tggcagttga cggctcctgg 60 

gactagatcc cgcgaggtag cccccgaact atttctctac gttttctctt gatcctcccg 120 

aaatcttcca gatccgcgta gtgaggaatc gtatcoaacg tcatgggggg cggagacotg 180 

aatctgaaga agagotggoa cccgcagacc ctoaggaatg tggagaaagt gtggaaggcc 240 

gagoagaagc atgaggctga gcggaagaag attgaggagc ttcagcggga gctgcgagaa 300 

gagagagccc gggaagagat gcagcgctat gcggaggatg ttggggccgt caagaaaaaa 360 

gaagaaaagt tggactggat gtaccagggt cctggtggga tggtgaaccg tgacgagtac 420 

ctgctggggc gccccattga caaatatgtt tttgagaaga tggaggagaa ggaggoaggc 480 

tgctcttctg aaacaggact tctcccaggc tctatctttg ccccatcagg tgccaattco 540 

cttcttgaca tggccagcaa gatccgggag gacccactct tcatcatcag gaagaaggag 600 

gaggagaaaa aacgagaggt attaaataat ccagtgaaaa tgaagaaaat caaagaattg 660 

ttgcaaatga gtctggaaaa aaaggagaag aagaaaaaga aggagaagaa aaagaagcac 720 

aagaaacata agcacagaag ctcgagtagt gatcgttcca gcagcgagga tgagcacagt 780 

gcagggagat cacagaagaa gatggcaaat tcctcccotg ttttgtccaa agtccctgga 840 

tatggcttac aggtccggaa ctctgaccgt aaccagggto ttcagggtcc tctgacagca 900 

gagcaaaaga gagggcatgg gatgaagaac cattccagat ccagaagctc ctcccactca 960 

cocccaagac atgccagcaa gaagagcacc agggaagcag ggtcccggga caggaggtct 1020 

cgatccctgg gcagaaggtc acggtcccca agacccagca aactgcacaa ctctaaggtg 1080 

aacaggagag agacaggcca aactaggagc ccatcaccta aaaaagaggt ctaccaaagg 1140 

cgacatgctc ccggatacac cagaaaactc tctgcagagg aattagagcg aaaacggcaa 1200 

gagatgatgg aaaacgccaa atggagggag gaggagagac tgaacatcct caagaggcat 1260 

55 gctaaggatg aggaacggga gcagaggcta gagaagctgg actcccggga tgggaagttc 1320 
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atccaccgca tgaagctgga gagtgcatct acttcctccc tggaggatcg ggtgaagcgg 1380 

aatatctact ctttacagag aacttcggta gctctggaga agaactttat gaaaagatga 1440 

aaactgtccc ctctcttatt ggttttcctg oattttccag ggaagctgct gaccccttaa 1500 

ttctctttat aagagttcaa atgacttctt tcacagatgt caaaccacca gtgttcaaag 1560 

tgaccctgct tcattgagtc ctgaaacagc tcacttcctt tgagagctag tgtgacttgc 1620 

tttgtgggac actcagtaac tttgggtttt gactctttaa cgggtgggca ctggaccatc 1680 

tcggtgggag tgctfcgtgcc actctggaag gctgttccct ggggttgtga tgtttatcat 1740 

gccacttcot tcttacctgt gccaacagac otatttcaot gcctcagcgt acaccagacc 1800 

cttcagaaac ctctctggtg tcacccagat agattgtgct tactgagaca aatgaacgtt 1860 

tacttgattt agaagataat gtgacagaat gatgtcaggt taggtcaaag ccaagggagt 1920 

gacagaatct ggaaaatcaa acaatacaaa aagccctaaa tgaactgtta actatttgat 1980 

ctttggatgt aaaattgtaa tgcgtatatg tacaaatgta caatttttac atgcttttaa 2040 

aaaaggttag ctttgtgaaa ataccttgtt tggtcaatga ctttactggg taatagaacc 2100 

acattgaacc ttgatggoaa gtaatacaat aaggcaggcc agctcgtttt tctctctgaa 2160 

tctggctggt ttaggaggag cctgggttta tcgacgagat atggagtatc tattcttttc 2220 

cactgcttgc agtctccaat gtaggcagtg taaaggtata gtaaaatgat tttaggagtc 2280 

agaaocaaat tgccaatatg ctccatggct cctaaaggaa aataaaatgg aagtttttaa 2340 

aaaaaaaaaa aaaaaa 2356 

<210> 73 

<211> 1646 

<212> DNA 
<213> Homo sapiens 



<400> 73 

gtggaatgtc atcagttaag gctattttca tttcttttgt ggatcttcag ttgcttcagg 60 

ccatctggat gtatacatgc aggtoacagg gaatatgatg gcttagcttg ggttcagagg 120 

cctgacacct caggctgcca aatgtggaag atttaaatac ttgaaccaat accctcctcc 180 

caaaaactga aattggcttc tgtttctgag ttggtccagg cgcaatgttc agcgtatttg 240 

aggaaatcao aagaattgta gttaaggaga tggatgctgg aggggatatg attgccgtta 300 

gaagccttgt tgatgctgat agattccgct gcttccatct ggtgggggag aagagaactt 360 

tctttggatg ccggcactac acaacaggcc tcaccctgat ggacattctg gacacacatg 420 

gggacaagtg gttagatgaa ctggattctg ggctccaagg tcaaaaggct gagtttcaaa 480 

ttotggataa tgtagactca acgggagagt tgatagtgag attacccaaa gaaataacaa 540 

tttcaggcag tttccagggc ttccaccatc agaaaatcaa gatatcggag aaccggatat 600 

cccagcagta tctggctacc cttgaaaaca ggaagctgaa gagggaacta cccttttcat 660 

tccgatcaat taatacgaga gaaaacctgt atctggtgac agaaactctg gagacggtaa 720 

aggaggaaac cctgaaaagc gaccggcaat ataaattttg gagccagatc tctcagggcc 780 

atotcagcta taaacacaag ggcoaaaggg aagtgaccat ccccccaaat cgggtcctga 840 

gctatcgagt aaagcagctt gtcttcccca acaaggagac gatgagaaag tctttgggtt 900 

cggaggattc cagaaacatg aaggagaagt tggaggacat ggagagtgtc ctcaaggacc 960 

tgacagagga gaagagaaaa gatgtgctaa actccctcgc taagtgcctc ggcaaggagg 1020 

atattoggca ggatctagag caaagagtat ctgaggtcct gatttccggg gagctacaca 1080 

tggaggaccc agacaagcct ctcctaagca gcotttttaa tgctgctggg gtcttggtag 1140 

aagogcgtgc aaaagccatt ctggacttcc tggatgccct gctagagctg tctgaagagc 1200 

agcagtttgt ggctgaggcc ctggagaagg ggacccttco tctgttgaag gaccaggtga 1260 

aatctgtcat ggagcagaac tgggatgagc tggocagcag toctcctgac atggactatg 1320 

accctgaggc acgaattctc tgtgcgctgt atgttgttgt ctctatcctg ctggagctgg 1380 

ctgaggggcc tacctctgtc tcttcctaac tacaaaagcc ctttctcccc acaagcctct 1440 

gggttttccc tttaccagtc tgtcctcact gccatcgcca ctaccatcct gtcaccagtg 1500 

ggacctcttt aaaacaagca gccaaccatt ctttgatgta tcccattcgc tocatgttaa 1560 

catccaaaac cagcctggat ttcatacatg gacttctgat taaaagtggc aggttgtgca 1620 

tgttaaaaaa aaaaaaaaaa aaaaaa 1646 

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<210> 74 

<211> 3340 

<212> DNA 

<213> Homo sapiens 






<400> 74 

cgggcgccca gagacagcgc cgcctcagat atcctgctgg atgacattgt cottacccat 60 

tctctcttcc tcccgacgga gaaatttctg caggagctac accagtactt tgttcgggca 120 

ggaggcatgg agggccotga agggotgggc cggaagcaag cctgtctagc catgcttctc 180 

catttcttgg acacctacca ggggctgctt caagaggaag agggggccgg ccacatcatc 240 

aaggatctat acctgctaat tatgaaggac gagtcccttt accagggcct ccgagaggac 300 

actctgaggc tgcaccagct ggtggagacg gtggaactaa agattccaga ggagaaccag 360 

ccacccagca agcaggtgaa gccactcttc cgccacttcc gccggataga ctcctgtctg 420 

cagacccggg tggcottccg gggctctgat gagatcttct gccgtgrtata catgcctgac 480 

cactcttatg tgaccatacg cagccgcctt tcagcatotg tgcaggacat tctgggctct 540 

gtgacggaga aacttoaata ttcagaggag cccgcggggc gtgaggattc cctcatcctg 600 

gtagctgtgt cctcctatgg agagaaggtc cttctccagc ccactgagga ctgtgttttc 660 

accgcactgg gcatcaacag ccacctgttt gcctgtactc gggacagcta tgaggctctg 720 

gtgcccctcc ccgaggagat ccaggtotcc cctggagaca cagagatcca ccgagtggag 780 

cctgaggacg ttgccaacca cctaactgcc ttccactggg agctgttccg atgtgtgcat 840 

gagctggagt tcgtggacta cgtgttccac ggggagcgcg gocgccggga gacggccaac 900 

ttggagctgo tgctgcagcg ctgcagcgag gtcacgcact gggtggcoac cgaagtgctg 960 

ctctgcgagg ccccgggcaa gcgcgcgcag ctgctcaaga agttcatcaa gatcgcggcc 1020 

otctgcaagc agaaccagga cctgctgtct ttctacgccg tggtcatggg gctggacaac 1080 

gccgctgtca gccgccttcg actcacctgg gagaagctgc cagggaaatt caagaaottg 1140 

tttcgoaaat ttgagaacot gacggacccc tgcaggaacc acaaaagcta ccgagaagtg 1200 

atctccaaaa tgaagcoccc tgtgattccc ttcgtgccto tgatcctcaa agacctgact 1260 

ttcotgcacg aagggagtaa gacccttgta gatggtttgg tgaacatcga gaagotgcat 1320 

tcagtggccg aaaaagtgag gacaatccgc aaataccgga gccggcccct ttgcotggac 1380 

atggaggcat cccccaatca cctgcagacc aaggcctatg tgcgccagftt tcaggtcato 1440 

gacaaccaga acatcctctt cgagctctcc tacaagctgg aggcaaacag tcagtgagag 1500 

tggaggctcc agtcagaccc gccagatcct tgggoacctg gcactcaagc actttgcacg 1560 

atgtotcaac caacatctga oatctttccc gtggagcaao ttcctgctcc acgggaaaga 1620 

ggtcgatgga tttacccctg gacccataag tctgttcatc ctgctgaagt cccctcccca 1680 

ttgctccttc aagccaaaac tacaatttgc tggttcctgt cccctctgag aaaggggata 1740 

gaaagctcot tcctctatgt cctcccatcg agatctgttc tggggatgga gcttccaact 1800 

tcctcttgca gcaggaaaga atgctgctca cccttctgtc ttgcagagtg ggattgtggg 1860 

agggattggc agccttcttc tccaccacct gtccagcttc ttcctggtca gggctgggac 1920 

ccccaggaat attatgttgc cgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg 1980 

tgtgtcttct tttagggagc aggagtgcat ctggtaattg agggtggatg ttgtgtgtgc 2040 

tggggagggg tccttctgtt tggtgctacc cttgtctact ctgcccctgg atggtgcggg 2100 

gtgctttotc cacccccaca ctccctgctc agctcctcgt gctgccctgc atgcccaggc 2160 

ttgtgagcca aggtgctttt tggggcaggg agtagcagca ggtgggaggg gttacccatc 2220 

agcccttgca agtcccccac tcaggcctct ggaaggtcca gggatgggct otgatgagag 2280 

ggtaaaagat gctcagggaa acacaggcct cagotgccta gaggaccctc cccctgcctt 2340 

gcagtgggct cgggtagagc agtatcagga gctagggttg tctgctgccc acactcctgc 2400 

tttttgggat atotaactgc taaggaggga gttgacatcc cccttctggc tcatgtgtct 2460 

gacaccaaca acatggtctc cgtccctctc tcttagactc tccotttgtc ctccccatag 2520 

agctggggtg gggtggatcc ctatactggg gcaggcagcc ccaaagbggg ggagggggat 2580 

ggcagagact gtaaaggcgc cactggactc tggcaaggcc tttattacct ttactccctc 2640 

cctctcccat caccagcctc aaggcctgag gggtgcaggg gctcctggca gctaotgggt 2700 

gaggtttcct ggcacagact cacccttctt tctggcacca ctctttccct tttgaagaga 2760 

cagcaacagc cgtagcaaaa gcagctgctg ctcctgctat gagggtgtat atatttttta 2820 

cccaaagctc tggaattgta catttatttt ttaaaactca aagagggaaa gagcottgta 2880 
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tcatatgtga acattgtatc ataggtaatg ttgtacagac ccttttatac agtgatctgt 2940 



cttgttcctg cagcaaaaat cctctatgga cataggaggt gctgtgtccc atgccttctt 3000 
gccctgacag tgtcccatgg gcccccttct gctccctgcc ccctccotgc tactgctgat 3060 
gcactgtcat ctccctgcag cacctggctt cccagccttc ctcctgaccc cttccaacag 3120 

cottggaact ccagctgcca ccaccctctg ggtcggacac tgggacccac tggcccagtc 3180 
ttggotgctg cttaccccta gccttgatgc ctgcccaggg acccccagco ccctcccgtfc 3240 
gcootgcagc tttaacagag tgaaccatgt gtattgtaca ggcgcggttg tcattgcaga 3300 
aaccgctggg tggagaagaa gccgataaag tctatgaatc 3340 


<210> 75 
<211> 4005 
<212> DNA 


<213> Homo sapiens 






<400> 75 

gggcaacagt ctgcccacct gtggacacca gatcctggga gctcctggtt agcaagtgag 60 

atctctggga tgtcagtgag gotggttgaa gaccagaggt aaactgcaga ggtcaccacc 120 

cccaccatgt cccaggtgat gtccagccca ctgctggcag gaggccatgc tgtcagcttg 180 

gcgccttgtg atgagcccag gaggaccctg cacccagcac ccagccccag octgccaccc 240 

cagtgttctt actacaccac ggaaggctgg ggagcccagg ccotgatggc ccocgtgocc 300 

tgcatggggc cccctggccg actccagcaa gccccacagg tggaggccaa agccaootgc 360 

ttcctgccgt cccctggtga gaaggccttg gggaccccag aggaccttga ctcctacatt 420 

gacttctcac tggagagcct caatcagatg atcctggaac tggaccccac cttccagctg 480 

cttcccccag ggaotggggg ctcccaggct gagctggccc agagcaccat gtcaatgaga 540 

aagaaggagg aatctgaagc cttggacata aagtacatcg aggrtgacctc cgccagatca 600 

aggtgccacg attggcccca goactgctcc agcccctctg tcaccccgcc cttcggctcc 660 

cctcgcagtg gtggcctcot cctttccaga gacgtcccco gagagacacg aagcagcagt 720 

gagagcctca tcttctctgg gaaccagggc agggggoacc agcgccctct gcccccctca 780 
gagggtctct cccctcgacc cccaaattco cccagcatct caatcccttg catggggagc 840 
aaggcctcga gcccccatgg tttgggctcc ccgctggtgg cttctccaag actggagaag 900 
cggctgggag gcctggcccc acagcggggc agcaggatct ctgtgctgtc agccagooca 960 

gtgtctgatg tcagctatat gtttggaagc agccagtccc tcotgcactc cagcaactoc 1020 

agccatcagt catcttccag atccttggaa agtccagcca actottcctc cagcctccac 1080 

agccttggot cagtgtccct gtgtacaaga cccagtgact tccaggctcc cagaaacccc 1140 

accctaacca tgggccaacc cagaacaccc cactctccac cactggccaa agaacatgcc 1200 

agcatctgcc ccccatccat caccaactcc atggtggaca tacccattgt gctgatcaac 1260 

ggctgcccag aaccagggtc ttctccaccc cagcggaccc cagga caeca gaactccgtt 1320 

caacctggag ctgcttctcc cagcaacccc tgtccagcca ccaggagcaa cagccagacc 1380 

ctgtcagatg ccccctttac cacatgccca gagggtcccg ocagggacat gcagcccacc 1440 

atgaagttcg tgatggacac atctaaatac tggtttaagc caaacatcac ccgagagcaa 1500 

gcaatcgagc tgctgaggaa ggaggageca ggggcttttg tcataaggga cagctcttca 1560 

taccgaggct ccttcggcct ggccctgaag gtgeaggagg ttcccgcgtc tgctcagaat 1620 

cgaccaggtg aggacagcaa tgacctcatc cgacacttcc tcatcgagtc gtctgccaaa 1680 

ggagtgcatc tcaaaggagc agatgaggag ccctaotttg ggagcctctc tgccttcgtg 1740 

tgecagcatt ccatcatggc cctggccctg ccotgcaaac t caeca tccc acagagagaa 1800 

ctgggaggtg cagatggggc ctcggactct acagacagcc cagcctcctg ccagaagaaa 1860 

tetgeggget gccacaccct gtabctgagc teagrtgageg tggagacoct gaetggagee 1920 

ctggccgtgc agaaagocat ctcoaccacc tttgagaggg aca tec tccc cacgcccacc 1980 

gtggtccact tcgaagtcac agagcagggc atcactctga ctgatgtcca gaggaaggtg 2040 

tttttcegge gccattaccc actcaocacc ctccgcttct gtggtatgga ccctgagcaa 2100 

cggaagtggc agaagtactg caaaccctcc tggatctttg ggtttgtggc caagagecag 2160 

acagagcctc aggagaacgt atgccacctc tttgcggagt atgacatggt ccagccagcc 2220 

tegcaggtea tcggcctggt gaetgetctg ctgcaggacg cagaaaggat gtaggggaga 2280 

gaetgectgt gcaootaacc aacacctcca ggggctcget aaggagcccc cctccacccc 2340 



ctgaatgggt gtggcttgtg gccatattga cagacoaatc tatgggacta gggggattgg 2400 

catcaagttg acacccttga acctgctatg gccttcagca gtcaccatca tccagacccc 2460 

ccgggcctca gtttcctcaa tcatagaaga agaccaatag acaagatcag ctgttcttag 2520 

atgctggtgg gcatttgaac atgotcctcc atgattctga agcatgcaca cctctgaaga 2580 

cccctgcatg aaaataacct ccaaggaccc tctgacccca tcgacctggg ccctgcccac 2640 

aoaacagtct gagcaagaga octgcagccc ctgtttcgtg gcagacagca ggtgcctggc 2700 

ggtgacocac ggggctcctg gcttgcagct ggtgatggtc aagaactgac tacaaaacag 2760 

gaatggatag actctatttc cttccatatc tgttcatctg ttccttttcc cactttctgg 2820 

gtggcttttt gggtccaccc agccaggatg ctgcaggcca agctgggtgt ggtatttagg 2880 

gcagctcagc agggggaact tgtccccatg gtcagaggag aacoagotgt cctgcacccc 2940 

cttgcagatg agtatcaccc catcttttct ttccaattgg tttttatttt tatttttttt 3000 

gagacagagt ctcactgtoa cccaggctga actgcagtgg tgtgatctag gctcactgca 3060 

acotccacct cccaggttca agcaattatc ctgcctcagg ctcccgagta gctgggatta 3120 

caggcatgtg caactcaccc agctaatttt gtatttttag tagagacagg gtttcaccat 3180 

gttggccagg ctggtcttga actcctgacc gcaggtaatc cacotgcttc ggcctcccaa 3240 

agtggtggga ttacaggcgo aagooaocca gcccagcttc tttccattcc ttgataggcg 3300 

agtattcoaa agctggtatc gtagctgccc taatgttgca tattaggcgg cgggggcaga 3360 

gataagggco atctctctgt gattctgcct cage toe tgt ottgetgage cctcccccaa 3420 

ccoacgctcc aacacacaca cacacacaca oacacacaca cacacacaca cacacacaca 3480 

cacgcccctc tactgetatg tggcttcaac cagcctcaca gccacacggg ggaagcagag 3540 

agtcaagaat gcaaagaggo cgcttcccta agaggcttgg aggagctggg ctctatccca 3600 

cacccacccc caccccaccc ccaoccagcc tccagaagct ggaaccattt otcccgcagg 3660 

cotgagttcc taaggaaacc accctaccgg ggtggaaggg agggtcaggg aagaaaccca 3720 

ctcttgctct acgaggagca agtgcctgcc ccctcccagc agccagccct gccaaagttg 3780 

cattatcttt ggecaagget gggcctgacg gttatgattt cagocotggg ectgeaggag 3840 

25 aggctgagat cagcccaccc agccagtggt cgagcactgc cccgacgcca aagtctgcag 3900 

aatgtgagat gaggttctca aggtcaoagg ccccagtccc agcctggggg ctggoagagg 3960 

cccccatata ctctgctaca gctcctatca tgaaaaataa aatgt 4005 

<210> 76 

<211> 1093 

<212> PRT 

<213> Homo sapiens 




<400> 76 

Met Lys Glu Met Val Gly Gly Cys Cys Val Cys Ser Asp Glu Arg Gly 

1 5 10 15 

Trp Ala Glu Asn Pro Leu Val Tyr Cys Asp Gly His Ala Cys Ser Val 

20 25 30 

Ala Val His Gln Ala Cys Tyr Gly Ile Val Gln Val Pro Thr Gly Pro 

35 40 45 

Trp Phe Cys Arg Lys Cys Glu Ser Gln Glu Arg Ala Ala Arg Val Arg 

50 55 60 

Cys Glu Leu Cys Pro His Lys Asp Gly Ala Leu Lys Arg Thr Asp Asn 
65 70 75 80 

Gly Gly Trp Ala His Val Val Cys Ala Leu Tyr Ile Pro Glu Val Gln 

85 90 95 

Phe Ala Asn Val Leu Thr Met Glu Pro Ile Val Leu Gln Tyr Val Pro 

100 105 110 

His Asp Arg Phe Asn Lys Thr Cys Tyr Ile Cys Glu Glu Thr Gly Arg 

115 120 125 

Glu Ser Lys Ala Ala Ser Gly Ala Cys Met Thr Cys Asn Arg His Gly 
130 135 140 
Cys Arg Gln Ala Phe His Val Thr Cys Ala Gln Met Ala Gly Leu Leu 
145 150 155 160 

Cys Glu Glu Glu Val Leu Glu Val Asp Asn Val Lys Tyr Cys Gly Tyr 

165 170 175 

Cys Lys Tyr His Phe Ser Lys Met Lys Thr Ser Arg His Ser Ser Gly 

180 185 190 

Gly Gly Gly Gly Gly Ala Gly Gly Gly Gly Gly Ser Met Gly Gly Gly 

195 200 205 

Gly Ser Gly Phe Ile Ser Gly Arg Arg Ser Arg Ser Ala Ser Pro Ser 

210 215 220 

Thr Gln Gln Glu Lys His Pro Thr His His Glu Arg Gly Gln Lys Lys 
225 230 235 240 

Ser Arg Lys Asp Lys Glu Arg Leu Lys Gln Lys His Lys Lys Arg Pro 

245 250 255 

Glu Ser Pro Pro Ser Ile Leu Thr Pro Pro Val Val Pro Thr Ala Asp 

260 265 270 

Lys Val Ser Ser Ser Ala Ser Ser Ser Ser His His Glu Ala Ser Thr 

275 280 285 

Gln Glu Thr Ser Glu Ser Ser Arg Glu Ser Lys Gly Lys Lys Ser Ser 

290 295 300 

Ser His Ser Leu Ser His Lys Gly Lys Lys Leu Ser Ser Gly Lys Gly 
305 310 315 320 

Val Ser Ser Phe Thr Ser Ala Ser Ser Ser Ser Ser Ser Ser Ser Ser 

325 330 335 

Ser Ser Gly Gly Pro Phe Gln Pro Ala Val Ser Ser Leu Gln Ser Ser 

340 345 350 

Pro Asp Phe Ser Ala Phe Pro Lys Leu Glu Gln Pro Glu Glu Asp Lys 

355 360 365 

Tyr Ser Lys Pro Thr Ala Pro Ala Pro Ser Ala Pro Pro Ser Pro Ser 

370 375 380 

Ala Pro Glu Pro Pro Lys Ala Asp Leu Phe Glu Gln Lys Val Val Phe 
385 390 395 400 

Ser Gly Phe Gly Pro Ile Met Arg Phe Ser Thr Thr Thr Ser Ser Ser 

405 410 415 

Gly Arg Ala Arg Ala Pro Ser Pro Gly Asp Tyr Lys Ser Pro His Val 

420 425 430 

Thr Gly Ser Gly Ala Ser Ala Gly Thr His Lys Arg Met Pro Ala Leu 

435 440 445 

Ser Ala Thr Pro Val Pro Ala Asp Glu Thr Pro Glu Thr Gly Leu Lys 

450 455 460 

Glu Lys Lys His Lys Ala Ser Lys Arg Ser Arg His Gly Pro Gly Arg 
465 470 475 480 

Pro Lys Gly Ser Arg Asn Lys Glu Gly Thr Gly Gly Pro Ala Ala Pro 

485 490 495 

Ser Leu Pro Ser Ala Gln Leu Ala Gly Phe Thr Ala Thr Ala Ala Ser 

500 505 510 

Pro Phe Ser Gly Gly Ser Leu Val Ser Ser Gly Leu Gly Gly Leu Ser 

515 520 525 

Ser Arg Thr Phe Gly Pro Ser Gly Ser Leu Pro Ser Leu Ser Leu Glu 

530 535 540 

Ser Pro Leu Leu Gly Ala Gly Ile Tyr Thr Ser Asn Lys Asp Pro Ile 
545 550 555 560 

Ser His Ser Gly Gly Met Leu Arg Ala Val Cys Ser Thr Pro Leu Ser 

565 570 575 

Ser Ser Leu Leu Gly Pro Pro Gly Thr Ser Ala Leu Pro Arg Leu Ser 

580 585 590 

Arg Ser Pro Phe Thr Ser Thr Leu Pro Ser Ser Ser Ala Ser Ile Ser 
595 600 605 
Thr Thr Gln Val Phe Ser Leu Ala Gly Ser Thr Phe Ser Leu Pro Ser 

610 615 620 

Thr His Ile Phe Gly Thr Pro Met Gly Ala Val Asn Pro Leu Leu Ser 
625 630 635 640 

Gln Ala Glu Ser Ser His Thr Glu Pro Asp Leu Glu Asp Cys Ser Phe 

645 650 655 

Arg Cys Arg Gly Thr Ser Pro Gln Glu Ser Leu Ser Ser Met Ser Pro 

660 665 670 

Ile Ser Ser Leu Pro Ala Leu Phe Asp Gln Thr Ala Ser Ala Pro Cys 

675 680 685 

Gly Gly Gly Gln Leu Asp Pro Ala Ala Pro Gly Thr Thr Asn Met Glu 

690 695 700 

Gln Leu Leu Glu Lys Gln Gly Asp Gly Glu Ala Gly Val Asn Ile Val 
705 710 715 720 

Glu Met Leu Lys Ala Leu His Ala Leu Gln Lys Glu Asn Gln Arg Leu 

725 730 735 

Gln Glu Gln Ile Leu Ser Leu Thr Ala Lys Lys Glu Arg Leu Gln Ile 

740 745 750 

Leu Asn Val Gln Leu Ser Val Pro Phe Pro Ala Leu Pro Ala Ala Leu 

755 760 765 

Pro Ala Ala Asn Gly Pro Val Pro Gly Pro Tyr Gly Leu Pro Pro Gln 

770 775 780 

Ala Gly Ser Ser Asp Ser Leu Ser Thr Ser Lys Ser Pro Pro Gly Lys 
785 790 795 800 

Ser Ser Leu Gly Leu Asp Asn Ser Leu Ser Thr Ser Ser Glu Asp Pro 

805 810 815 

His Ser Gly Cys Pro Ser Arg Ser Ser Ser Ser Leu Ser Phe His Ser 

820 825 830 

Thr Pro Pro Pro Leu Pro Leu Leu Gln Gln Ser Pro Ala Thr Leu Pro 

835 840 845 

Leu Ala Leu Pro Gly Ala Pro Ala Pro Leu Pro Pro Gln Pro Gln Asn 

850 855 860 

Gly Leu Gly Arg Ala Pro Gly Ala Ala Gly Leu Gly Ala Met Pro Met 
865 870 875 880 

Ala Glu Gly Leu Leu Gly Gly Leu Ala Gly Ser Gly Gly Leu Pro Leu 

885 890 895 

Asn Gly Leu Leu Gly Gly Leu Asn Gly Ala Ala Ala Pro Asn Pro Ala 

900 905 910 

Ser Leu Ser Gln Ala Gly Gly Ala Pro Thr Leu Gln Leu Pro Gly Cys 

915 920 925 

Leu Asn Ser Leu Thr Glu Gln Gln Arg His Leu Leu Gln Gln Gln Glu 

930 935 940 

Gln Gln Leu Gln Gln Leu Gln Gln Leu Leu Ala Ser Pro Gln Leu Thr 
945 950 955 960 

Pro Glu His Gln Thr Val Val Tyr Gln Met Ile Gln Gln Ile Gln Gln 

965 970 975 

Lys Arg Glu Leu Gln Arg Leu Gln Met Ala Gly Gly Ser Gln Leu Pro 

980 985 990 

Met Ala Ser Leu Leu Ala Gly Ser Ser Thr Pro Leu Leu Ser Ala Gly 

995 1000 1005 

Thr Pro Gly Leu Leu Pro Thr Ala Ser Ala Pro Pro Leu Leu Pro 

1010 1015 1020 

Ala Gly Ala Leu Val Ala Pro Ser Leu Gly Asn Asn Thr Ser Leu 

1025 1030 1035 

Met Ala Ala Ala Ala Ala Ala Ala Ala Val Ala Ala Ala Gly Gly 

1040 1045 1050 

Pro Pro Val Leu Thr Ala Gln Thr Asn Pro Phe Leu Ser Leu Ser 
1055 1060 1065 
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Gly Ala Glu Gly Ser Gly Gly Gly Pro Lys Gly Gly Thr Ala Asp 
1070 1075 1080 

Lys Gly Ala Ser Ala Asn Gln Glu Lys Gly 
1085 1090 

<210> 77 

<211> 344 

<212> PRT 

<213> Homo sapiens 



<400> 77 

Met His Arg Thr Thr Arg Ile Lys Ile Thr Glu Leu Asn Pro His Leu 

1 5 10 15 

Met Cys Ala Leu Cys Gly Gly Tyr Phe Ile Asp Ala Thr Thr Ile Val 

20 25 30 

Glu Cys Leu His Ser Phe Cys Lys Thr Cys Ile Val Arg Tyr Leu Glu 

35 40 45 

Thr Asn Lys Tyr Cys Pro Met Cys Asp Val Gln Val His Lys Thr Arg 

50 55 60 

Pro Leu Leu Ser Ile Arg Ser Asp Lys Thr Leu Gln Asp Ile Val Tyr 
65 70 75 80 

Lys Leu Val Pro Gly Leu Phe Lys Asp Glu Met Lys Arg Arg Arg Asp 

85 90 95 

Phe Tyr Ala Ala Tyr Pro Leu Thr Glu Val Pro Asn Gly Ser Asn Glu 

100 105 110 

Asp Arg Gly Glu Val Leu Glu Gln Glu Lys Gly Ala Leu Ser Asp Asp 

115 120 125 

Glu Ile Val Ser Leu Ser Ile Glu Phe Tyr Glu Gly Ala Arg Asp Arg 

130 135 140 

Asp Glu Lys Lys Gly Pro Leu Glu Asn Gly Asp Gly Asp Lys Glu Lys 
145 150 155 160 

Thr Gly Val Arg Phe Leu Arg Cys Pro Ala Ala Met Thr Val Met His 

165 170 175 

Leu Ala Lys Phe Leu Arg Asn Lys Met Asp Val Pro Ser Lys Tyr Lys 

180 185 190 

Val Glu Val Leu Tyr Glu Asp Glu Pro Leu Lys Glu Tyr Tyr Thr Leu 

195 200 205 

Met Asp Ile Ala Tyr Ile Tyr Pro Trp Arg Arg Asn Gly Pro Leu Pro 

210 215 220 

Leu Lys Tyr Arg Val Gln Pro Ala Cys Lys Arg Leu Thr Leu Ala Thr 
225 230 235 240 

Val Pro Thr Pro Ser Glu Gly Thr Asn Thr Ser Gly Ala Ser Glu Cys 

245 250 255 

Glu Ser Val Ser Asp Lys Ala Pro Ser Pro Ala Thr Leu Pro Ala Thr 

260 265 270 

Ser Ser Ser Leu Pro Ser Pro Ala Thr Pro Ser His Gly Ser Pro Ser 

275 280 285 

Ser His Gly Pro Pro Ala Thr His Pro Thr Ser Pro Thr Pro Pro Ser 

290 295 300 

Thr Ala Ser Gly Ala Thr Thr Ala Ala Asn Gly Gly Ser Leu Asn Cys 
305 310 315 320 

Leu Gln Thr Pro Ser Ser Thr Ser Arg Gly Arg Lys Met Thr Val Asn 
325 330 335 
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Gly Ala Pro Val Pro Pro Leu Thr 
340 

<210> 78 

<211> 416 

<212> PRT 

<213> Homo sapiens 



<400> 78 

Met Ser Ser Asn Cys Thr Ser Thr Thr Ala Val Ala Val Ala Pro Leu

1 5 10 15

Ser Ala Ser Lys Thr Lys Thr Lys Lys Lys His Phe Val Cys Gln Lys

20 25 30

Val Lys Leu Phe Arg Ala Ser Glu Pro Ile Leu Ser Val Leu Met Trp

35 40 45 

Gly Val Asn His Thr Ile Asn Glu Leu Ser Asn Val Pro Val Pro Val

50 55 60 

Met Leu Met Pro Asp Asp Phe Lys Ala Tyr Ser Lys Ile Lys Val Asp
65 70 75 80 

Asn His Leu Phe Asn Lys Glu Asn Leu Pro Ser Arg Phe Lys Phe Lys 

85 90 95 

Glu Tyr Cys Pro Met Val Phe Arg Asn Leu Arg Glu Arg Phe Gly Ile

100 105 110 

Asp Asp Gln Asp Tyr Gln Asn Ser Val Thr Arg Ser Ala Pro Ile Asn

115 120 125 

Ser Asp Ser Gln Gly Arg Cys Gly Thr Arg Phe Leu Thr Thr Tyr Asp

130 135 140 

Arg Arg Phe Val Ile Lys Thr Val Ser Ser Glu Asp Val Ala Glu Met
145 150 155 160 

His Asn Ile Leu Lys Lys Tyr His Gln Phe Ile Val Glu Cys His Gly

165 170 175 

Asn Thr Leu Leu Pro Gln Phe Leu Gly Met Tyr Arg Leu Thr Val Asp

180 185 190 

Gly Val Glu Thr Tyr Met Val Val Thr Arg Asn Val Phe Ser His Arg 

195 200 205 

Leu Thr Val His Arg Lys Tyr Asp Leu Lys Gly Ser Thr Val Ala Arg 

210 215 220 

Glu Ala Ser Asp Lys Glu Lys Ala Lys Asp Leu Pro Thr Phe Lys Asp 
225 230 235 240 

Asn Asp Phe Leu Asn Glu Gly Gln Lys Leu His Val Gly Glu Glu Ser

245 250 255 

Lys Lys Asn Phe Leu Glu Lys Leu Lys Arg Asp Val Glu Phe Leu Ala 

260 265 270 

Gln Leu Lys Ile Met Asp Tyr Ser Leu Leu Val Gly Ile His Asp Val

275 280 285 

Asp Arg Ala Glu Gln Glu Glu Met Glu Val Glu Glu Arg Ala Glu Asp

290 295 300 

Glu Glu Cys Glu Asn Asp Gly Val Gly Gly Asn Leu Leu Cys Ser Tyr 
305 310 315 320 

Gly Thr Pro Pro Asp Ser Pro Gly Asn Leu Leu Ser Phe Pro Arg Phe 

325 330 335 

Phe Gly Pro Gly Glu Phe Asp Pro Ser Val Asp Val Tyr Ala Met Lys 
340 345 350 






Ser His Glu Ser Ser Pro Lys Lys
355 360 
Asp Ile Leu Thr Pro Tyr Asp Thr

370 375 
Lys Thr Val Lys His Gly Ala Gly 
385 390
Glu Gln Tyr Ser Lys Arg Phe Asn
405 

<210> 79 

<211> 500 

<212> PRT 

<213> Homo sapiens 



<400> 79 
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Arg 
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Ser 


Ile Phe Glu
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275 
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Lys 
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Met Ser Ala 
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290 






295 





Glu Val Tyr Phe Met Ala 


Ile


Ile
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380 
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Lou Gin His Arg Ser Cys Asp Ala Cya Met Ser Sor Asp Leu Thr Phe 
305 310 315 320 

Asn Cys Ser Trp Cys His Val Leu Gln Arg Cys Ser Ser Gly Phe Asp

325 330 335 

Arg Tyr Arg Gln Glu Trp Met Asp Tyr Gly Cys Ala Gln Glu Ala Glu

340 345 350 

Gly Arg Met Cys Glu Asp Phe Gln Asp Glu Asp His Asp Ser Ala Ser

355 360 365 

Pro Asp Thr Ser Phe Ser Pro Tyr Asp Gly Asp Leu Thr Thr Thr Ser 

370 375 380 

Ser Ser Leu Phe Ile Asp Ser Leu Thr Thr Glu Asp Asp Thr Lys Leu
385 390 395 400 

Asn Pro Tyr Ala Gly Gly Asp Gly Leu Gln Asn Asn Leu Ser Pro Lys

405 410 415 

Thr Lys Gly Thr Pro Val His Leu Gly Thr Ile Val Gly Ile Val Leu

420 425 430 

Ala Val Leu Leu Val Ala Ala Ile Ile Leu Ala Gly Ile Tyr Ile Asn

435 440 445 

Gly His Pro Thr Ser Asn Ala Ala Leu Phe Phe Ile Glu Arg Arg Pro

450 455 460 

His His Trp Pro Ala Met Lys Phe Arg Ser His Pro Asp His Ser Thr 
465 470 475 480 

Tyr Ala Glu Val Glu Pro Ser Gly His Glu Lys Glu Gly Phe Met Glu 
485 490 495 

Ala Glu Gln Cys
500 

<210> 80 

<211> 509 

<212> PRT 

<213> Homo sapiens 



<400> 80 

Met Glu Asp Ile Gln Thr Asn Ala Glu Leu Lys Ser Thr Gln Glu Gln

1 5 10 15 

Ser Val Pro Ala Glu Ser Ala Ala Val Leu Asn Asp Tyr Ser Leu Thr 

20 25 30 

Lys Ser His Glu Met Glu Asn Val Asp Ser Gly Glu Gly Pro Ala Asn 

35 40 45 

Glu Asp Glu Asp Ile Gly Asp Asp Ser Met Lys Val Lys Asp Glu Tyr

50 55 60 

Ser Glu Arg Asp Glu Asn Val Leu Lys Ser Glu Pro Met Gly Asn Ala 
65 70 75 80 

Glu Glu Pro Glu Ile Pro Tyr Ser Tyr Ser Arg Glu Tyr Asn Glu Tyr

85 90 95 

Glu Asn Ile Lys Leu Glu Arg His Val Val Ser Phe Asp Ser Ser Arg

100 105 110

Pro Thr Ser Gly Lys Met Asn Cys Asp Val Cys Gly Leu Ser Cys Ile

115 120 125 

Ser Phe Asn Val Leu Met Val His Lys Arg Ser His Thr Gly Glu Arg 

130 135 140 

Pro Phe Gln Cys Asn Gln Cys Gly Ala Ser Phe Thr Gln Lys Gly Asn
145 150 155 160 
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Leu Leu Arg His lie Lys Leu His Thr Gly Glu Lys Pro Phe Lys Cys 

165 170 175 

His Leu Cys Asn Tyr Ala Cys Gln Arg Arg Asp Ala Leu Thr Gly His

180 185 190 

Leu Arg Thr His Ser Val Glu Lys Pro Tyr Lys Cys Glu Phe Cys Gly 

195 200 205 

Arg Ser Tyr Lys Gln Arg Ser Ser Leu Glu Glu His Lys Glu Arg Cys

210 215 220 

Arg Thr Phe Leu Gln Ser Thr Asp Pro Gly Asp Thr Ala Ser Ala Glu
225 230 235 240 

Ala Arg His Ile Lys Ala Glu Met Gly Ser Glu Arg Ala Leu Val Leu

245 250 255 

Asp Arg Leu Ala Ser Asn Val Ala Lys Arg Lys Ser Ser Met Pro Gln

260 265 270 

Lys Phe Ile Gly Glu Lys Arg His Cys Phe Asp Val Asn Tyr Asn Ser

275 280 285 

Ser Tyr Met Tyr Glu Lys Glu Ser Glu Leu Ile Gln Thr Arg Met Met

290 295 300 

Asp Gln Ala Ile Asn Asn Ala Ile Ser Tyr Leu Gly Ala Glu Ala Leu
305 310 315 320

Cys Pro Leu Val Gln Thr Pro Pro Ala Pro Thr Ser Glu Met Val Pro

325 330 335 

Val Ile Ser Ser Met Tyr Pro Ile Ala Leu Thr Arg Ala Glu Met Ser

340 345 350 

Asn Gly Ala Pro Gln Glu Leu Glu Arg Lys Ser Ile Leu Leu Pro Glu

355 360 365 

Lys Ser Val Pro Ser Glu Arg Gly Leu Ser Pro Asn Asn Ser Gly His 

370 375 380 

Asp Ser Thr Asp Thr Asp Ser Asn His Glu Glu Arg Gln Asn His Ile
385 390 395 400

Tyr Gln Gln Asn His Met Val Leu Ser Arg Ala Arg Asn Gly Met Pro

405 410 415 

Leu Leu Lys Glu Val Pro Arg Ser Tyr Glu Leu Leu Lys Pro Pro Pro 

420 425 430

Ile Cys Pro Arg Asp Ser Val Lys Val Ile Asp Lys Glu Gly Glu Val

435 440 445

Met Asp Val Tyr Arg Cys Asp His Cys Arg Val Leu Phe Leu Asp Tyr

450 455 460 

Val Met Phe Thr Ile His Met Gly Cys His Gly Phe Arg Asp Pro Phe
465 470 475 480 

Glu Cys Asn Met Cys Gly Asp Arg Ser His Asp Arg Tyr Glu Phe Ser 

485 490 495 

Ser His Ile Ala Arg Gly Glu His Arg Ser Leu Leu Lys
500 505 

<210> 81 

<211> 440 

<212> PRT 

<213> Homo sapiens 






<400> 81 

Met Pro Ile Pro Pro Pro Pro Pro Pro Pro Pro Gly Pro Pro Pro Pro

1 5 10 15 

Pro Thr Phe His Gln Ala Asn Thr Glu Gln Pro Lys Leu Ser Arg Asp

20 25 30 

Glu Gln Arg Gly Arg Gly Ala Leu Leu Gln Asp Ile Cys Lys Gly Thr

35 40 45 

Lys Leu Lys Lys Val Thr Asn Ile Asn Asp Arg Ser Ala Pro Ile Leu

50 55 60 

Glu Lys Pro Lys Gly Ser Ser Gly Gly Tyr Gly Ser Gly Gly Ala Ala 
65 70 75 80 

Leu Gln Pro Lys Gly Gly Leu Phe Gln Gly Gly Val Leu Lys Leu Arg

85 90 95 

Pro Val Gly Ala Lys Asp Gly Ser Glu Asn Leu Ala Gly Lys Pro Ala 

100 105 110 

Leu Gln Ile Pro Ser Ser Arg Ala Ala Ala Pro Arg Pro Pro Val Ser

115 120 125 

Ala Ala Ser Gly Arg Pro Gln Asp Asp Thr Asp Ser Ser Arg Ala Ser

130 135 140 

Leu Pro Glu Leu Pro Arg Met Gln Arg Pro Ser Leu Pro Asp Leu Ser
145 150 155 160 

Arg Pro Asn Thr Thr Ser Ser Thr Gly Met Lys His Ser Ser Ser Ala 

165 170 175 

Pro Pro Pro Pro Pro Pro Gly Arg Arg Ala Asn Ala Pro Pro Thr Pro 

180 185 190 

Leu Pro Met His Ser Ser Lys Ala Pro Ala Tyr Asn Arg Glu Lys Pro 

195 200 205 

Leu Pro Pro Thr Pro Gly Gln Arg Leu His Pro Gly Arg Glu Gly Pro

210 215 220 

Pro Ala Pro Pro Pro Val Lys Pro Pro Pro Ser Pro Val Asn Ile Arg
225 230 235 240 

Thr Gly Pro Ser Gly Gln Ser Leu Ala Pro Pro Pro Pro Pro Tyr Arg

245 250 255 

Gln Pro Pro Gly Val Pro Asn Gly Pro Ser Ser Pro Thr Asn Glu Ser

260 265 270 

Ala Pro Glu Leu Pro Gln Arg His Asn Ser Leu His Arg Lys Thr Pro

275 280 285 

Gly Pro Val Arg Gly Leu Ala Pro Pro Pro Pro Thr Ser Ala Ser Pro 

290 295 300 

Ser Leu Leu Ser Asn Arg Pro Pro Pro Pro Ala Arg Asp Pro Pro Ser 
305 310 315 320

Arg Gly Ala Ala Pro Pro Pro Pro Pro Pro Val Ile Arg Asn Gly Ala

325 330 335 

Arg Asp Ala Pro Pro Pro Pro Pro Pro Tyr Arg Met His Gly Ser Glu 

340 345 350 

Pro Pro Ser Arg Gly Lys Pro Pro Pro Pro Pro Ser Arg Thr Pro Ala 

355 360 365 

Gly Pro Pro Pro Pro Pro Pro Pro Pro Leu Arg Asn Gly His Arg Asp 

370 375 380 

Ser Ile Thr Thr Val Arg Ser Phe Leu Asp Asp Phe Glu Ser Lys Tyr
385 390 395 400 

Ser Phe His Pro Val Glu Asp Phe Pro Ala Pro Glu Glu Tyr Lys His 

405 410 415

Phe Gln Arg Ile Tyr Pro Ser Lys Thr Asn Arg Ala Ala Arg Gly Ala

420 425 430 

Pro Pro Leu Pro Pro Ile Leu Arg
435 440
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<210> 82 

<211> 205 

<212> PRT 
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<400> 82 

Met Ser Ile Met Ser Tyr Asn Gly Gly Ala Val Met Ala Met Lys Gly

1 5 10 15 

Lys Asn Cys Val Ala Ile Ala Ala Asp Arg Arg Phe Gly Ile Gln Ala

20 25 30 

Gln Met Val Thr Thr Asp Phe Gln Lys Ile Phe Pro Met Gly Asp Arg

35 40 45 

Leu Tyr Ile Gly Leu Ala Gly Leu Ala Thr Asp Val Gln Thr Val Ala

50 55 60 

Gln Arg Leu Lys Phe Arg Leu Asn Leu Tyr Glu Leu Lys Glu Gly Arg
65 70 75 80 

Gln Ile Lys Pro Tyr Thr Leu Met Ser Met Val Ala Asn Leu Leu Tyr

85 90 95 

Glu Lys Arg Phe Gly Pro Tyr Tyr Thr Glu Pro Val Ile Ala Gly Leu

100 105 110 

Asp Pro Lys Thr Phe Lys Pro Phe Ile Cys Ser Leu Asp Leu Ile Gly

115 120 125 

Cys Pro Met Val Thr Asp Asp Phe Val Val Ser Gly Thr Cys Ala Glu 

130 135 140 

Gln Met Tyr Gly Met Cys Glu Ser Leu Trp Glu Pro Asn Met Asp Pro
145 150 155 160 

Asp His Leu Phe Glu Thr Ile Ser Gln Ala Met Leu Asn Ala Val Asp

165 170 175 

Arg Asp Ala Val Ser Gly Met Gly Val Ile Val His Ile Ile Glu Lys

180 185 190 

Asp Lys Ile Thr Thr Arg Thr Leu Lys Ala Arg Met Asp
195 200 205 

<210> 83 

<211> 190 

<212> PRT 

<213> Homo sapiens 



<400> 83 

Leu Thr Arg Ser Cys Ser Thr Cys Cys Pro Ala Val Ala Cys Leu Val 

1 5 10 15

Gly Arg Gly Val Val Thr Ser Gly Ala Met His Gln Cys Trp Gly Glu

20 25 30 

Glu Met Leu Gln Gly Met Leu Leu Trp Gly Trp Ala Thr Cys Pro Leu

35 40 45 

Ser Asn Pro Gly Arg Trp Gly Arg Thr Val Gly Leu Gln His Pro Ala

50 55 60 

Val Val Ser Ala Phe Arg Ala Leu Leu Leu Leu Met Leu Thr Val His 
65 70 75 80 
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Val Ser Tyr Leu Ser Leu lie Arg Phe Asp Tyr Gly Tyr Asn Leu Val 

85 90 95 

Ala Asn Val Ala Ile Gly Leu Val Asn Val Val Trp Trp Leu Ala Trp

100 105 110 

Cys Leu Trp Asn Gln Arg Arg Leu Pro His Val Arg Lys Cys Val Val

115 120 125 

Val Val Leu Leu Leu Gln Gly Leu Ser Leu Leu Glu Leu Leu Asp Phe

130 135 140 

Pro Pro Leu Phe Trp Val Leu Asp Ala His Ala Ile Trp His Ile Ser
145 150 155 160 

Thr Ile Pro Val His Val Leu Phe Phe Ser Phe Leu Glu Asp Asp Ser

165 170 175 

Leu Tyr Leu Leu Lys Glu Ser Glu Asp Lys Phe Lys Leu Asp 
180 185 190

<210> 84

<211> 368

<212> PRT 

<213> Homo sapiens 



<400> 84 

Ala Pro Pro Pro Ala Ala Ser Gln Gly Glu Arg Met Ala Gly Leu Ala

1 5 10 15

Ala Arg Leu Val Leu Leu Ala Gly Ala Ala Ala Leu Ala Ser Gly Ser 

20 25 30 

Gln Gly Asp Arg Glu Pro Val Tyr Arg Asp Cys Val Leu Gln Cys Glu

35 40 45 

Glu Gln Asn Cys Ser Gly Gly Ala Leu Asn His Phe Arg Ser Arg Gln

50 55 60 

Pro Ile Tyr Met Ser Leu Ala Gly Trp Thr Cys Arg Asp Asp Cys Lys
65 70 75 80 

Tyr Glu Cys Met Trp Val Thr Val Gly Leu Tyr Leu Gln Glu Gly His

85 90 95 

Lys Val Pro Gln Phe His Gly Lys Trp Pro Phe Ser Arg Phe Leu Phe

100 105 110 

Phe Gln Glu Pro Ala Ser Ala Val Ala Ser Phe Leu Asn Gly Leu Ala

115 120 125 

Ser Leu Val Met Leu Cys Arg Tyr Arg Thr Phe Val Pro Ala Ser Ser 

130 135 140 

Pro Met Tyr His Thr Cys Val Ala Phe Ala Trp Val Ser Leu Asn Ala
145 150 155 160 

Trp Phe Trp Ser Thr Val Phe His Thr Arg Asp Thr Asp Leu Thr Glu 

165 170 175 

Lys Met Asp Tyr Phe Cys Ala Ser Thr Val Ile Leu His Ser Ile Tyr

180 185 190 

Leu Cys Cys Val Arg Thr Val Gly Leu Gln His Pro Ala Val Val Ser

195 200 205 

Ala Phe Arg Ala Leu Leu Leu Leu Met Leu Thr Val His Val Ser Tyr 

210 215 220 

Leu Ser Leu Ile Arg Phe Asp Tyr Gly Tyr Asn Leu Val Ala Asn Val
225 230 235 240 

Ala Ile Gly Leu Val Asn Val Val Trp Trp Leu Ala Trp Cys Leu Trp
245 250 255 



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Val 
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Val 


His 


Val 


Leu Phe 


Phe Ser 


Phe 
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Lys 


Glu 


Ser Glu 


Asp Lys 


Phe 






325 






Phe 


Ala 
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Pro Leu 


Thr Pro 


Cys 








340 






Ala 


Arg 


Thr 


Pro Thr 


Ser Gly Thr 




355 






360 
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<211> 190 

<212> PRT 
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Arg Lys Cys Val 


Val Val 


Val 


Leu 


265 


270 






Glu Leu Leu Asp 


Phe Pro 


Pro 


Leu 




285 






Ile Trp His Ile


Ser Thr 


Ile


Pro 


300 








Leu Glu Asp Asp 


Ser Leu 


Tyr 


Leu 


315 






320 


Lys Leu Val Glu 


Ala Asp 


Trp 


Ile


330 




335 




Pro Ser Leu Arg 


Glu Gly 


Ser 


Tyr 


345 


350 






Arg Val Ala Cys 


Ala Ser 


Phe 


Phe 


365 
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Leu 


Thr 


Arg 


Ser Cys 


Ser Thr Cys 
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Gly 


Arg 


Gly 


Val Val 


Thr Ser Gly 








20 




Glu 


Met 


Leu 


Gln Gly


Met Leu Leu






35 




40 


Ser 


Asn 


Pro 


Gly Arg 


Trp Gly Arg 




50 






55 


Val 


Val 


Ser 


Ala Phe 


Arg Ala Leu 


65 








70 


Val 


Ser 


Tyr 


Leu Ser 


Leu Ile Arg








85 




Ala 


Asn 


Val 


Ala Ile


Gly Leu Val 








100 




Cys 


Leu 


Trp 


Asn Gln


Arg Arg Leu 




115 




120 


Val 


Val 


Leu 


Leu Leu 


Gln Gly Leu




130 






135 


Pro 


Pro 


Leu 


Phe Trp 


Val Leu Asp 


145 








150 


Thr 


Ile


Pro 


Val His 


Val Leu Phe 








165 




Leu 


Tyr 


Leu 


Leu Lys 


Glu Ser Glu 






180 
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Cys Pro Ala Val 


Ala Cys 


Leu 


Val 


10 




15 




Ala Met His Gln


Cys Trp 


Gly 


Glu 


25 


30 






Trp Gly Trp Ala 


Thr Cys 


Pro 


Leu 




45 






Thr Val Gly Leu 


Gln His


Pro 


Ala 


60 








Leu Leu Leu Met 


Leu Thr 


Val 


His 


75 






80 


Phe Asp Tyr Gly 


Tyr Asn 


Leu 


Val 


90 




95 




Asn Val Val Trp 


Trp Leu 


Ala 


Trp 


105 


110 






Pro His Val Arg 


Lys Cys 


Val 


Val 




125 






Ser Leu Leu Glu 


Leu Leu 


Asp 


Phe 


140 








Ala His Ala Ile


Trp His 


Ile


Ser 


155 






160 


Phe Ser Phe Leu 


Glu Asp 


Asp 


Ser 


170 




175 




Asp Lys Phe Lys 


Leu Asp 






185 


190 
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<213> Homo sapiens 



<400> 86 




Met


Ala 


Gly Leu Ala 


Ala Arg Leu 
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Lou 


Ala 


Ser Gly Ser 


Gln Gly Asp
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Val 


Leu 


Gln Cys Glu


Glu Gln Asn






35 


40 


Phe 


Arg 


Ser Arg Gln


Pro Ile Tyr
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55 


Arg 


Asp 


Asp Cys Lys 


Tyr Glu Cys 
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70 


Leu


Gln


Glu Gly His 


Lys Val Pro 






85 




Ser


Arg 


Phe Leu Phe 


Phe Gln Glu
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Gly Leu Ala 


Ser Leu Val 
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Val 
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Ala Ser Ser 


Pro Met Tyr
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Val 


Ser 
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145 
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Thr 
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Lys Trp Thr 
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Ser Thr Cys 


Ala Ala Ser 
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Ser Ser Ala 


Phe Arg Ala 




195 
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Val 
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Gin Arg Arg 




245 




Val 
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260 




Phe 
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Pro Leu Phe 


Trp Val Leu 






275 


280 


Ser 


Thr 


Ile Pro Val


His Val Leu 




290 




295 


Ser 


Leu 


Tyr Leu Leu 


Lys Glu Ser 


305 






310 
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Val 
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Phe Phe Ser 
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Glu 


Asp 


Asp 




300 
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315 
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<400> 87 



Met 


Ala 


Gly 


Leu Ala Ala Arg Leu 
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Ala 


Ser 


Gly Ser Gln Gly Asp








20 


Val 


Leu 


Gln


Cys Glu Glu Gln Asn






35 


40 


Phe 


Arg 


Ser 


Arg Gln Pro Ile Tyr




50 




55 


Arg 


Asp 


Asp 


Cys Lys Tyr Glu Cys 
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70 


Leu 


Gln


Glu 


Gly His Lys Val Pro 








85 
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Arg 
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Leu Phe Phe Gin Glu 








100 


Leu 


Asn 


Gly 


Leu Ala Ser Leu Val 
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Val 
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130 
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Val 
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Val 


Ala 


Gly 
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Ala 
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<400> 88 






Met Ala 


Gly 


Leu 


Ala Ala Arg Leu 


1 






5 


Leu Ala 


Ser 


Gly 


Ser Gln Gly Asp






20 




Val Leu 


Gln


Cys 


Glu Glu Gln Asn




35 




40 


Phe Arg 


Ser 


Arg 


Gln Pro Ile Tyr


50 






55 


Arg Asp 


Asp 


Cys 


Lys Tyr Glu Cys 


65 






70 


Leu Gln


Glu 


Gly 


His Lys Val Pro 








85 


Ser Arg 


Phe 


Leu 


Phe Phe Gln Glu




100 




Leu Asn 


Gly 


Leu 


Ala Ser Leu Val 




115 




120 



Val Leu Leu Ala 


Gly 


Ala Ala 


Ala


10 




15 




Arg Glu Pro Val 


Tyr 


Arg Asp 


Cys 


25 




30 




Cys Ser Gly Gly 


Ala 


Leu Asn 


His




45 






Met Ser Leu Ala 


Gly 


Trp Thr 


Cys


60 








Met Trp Val Thr 


Val 


Gly Leu 


Tyr 


75 






80 


Gln Phe His Gly


Lys 


Trp Pro 


Phe 


90 




95 




Pro Ala Ser Ala 


Val 


Ala Ser 


Phe


105 




110 




Met Leu Cys Arg 


Tyr 


Arg Thr 


Phe 




125 






His Thr Cys Val 


Ala 


Phe Ala 


Trp 


140 








Ser Thr Val Phe 


His 


Thr Arg 


Asp 


155 






160 


Tyr Phe Cys Ala 


Ser 


Thr Val 


Ile


170 




175 




Val Arg Pro Gly 


Gln


Arg Gly 


Val 


185 




190 




Pro Ala Ala Ala 


Ala 


Ser Arg 
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75 
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Trp Pro 


Phe 
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Ala Ser 
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Tyr Arg Thr 
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125 
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Ser 
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Val 
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Val 
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Val Ala He Gly 


225 








230 
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Trp Asn Gin Arg 










245 


Val 


Val 


Val 


Val 
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260 




Asp 
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Pro 


Pro 


Leu Phe Trp Val 




275 




280 


Ile
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Thr 


Ile


Pro Val His Val 




290 






295 


Asp 


Ser 


Leu Tyr 
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305 








310 


<210> 89 







His Thr Cys 


Val 


Ala Phe Ala Trp 




140 




Ser Thr Val 


Phe 


His Thr Arg Asp 


155 




160 


Tyr Phe Cys 


Ala 


Ser Thr Val Ile


170 




175 


Val Arg Thr 


Val 


Gly Leu Gln His


185 




190 


Ala Leu Leu 


Leu 


Leu Met Leu Thr 






205 


Ile Arg Phe


Asp 


Tyr Gly Tyr Asn 




220 




Leu Val Asn 


Val 


Val Trp Trp Leu 


235 




240 


Arg Leu Pro 


His 


Val Arg Lys Cys 


250 




255 


Gly Leu Ser 


Leu 


Leu Glu Leu Leu 


265 




270 


Leu Asp Ala 


His 


Ala Ile Trp His




285 


Leu Phe Phe


Ser 


Phe Leu Glu Asp 




300 


Ser Glu Asp 


Lys 


Phe Lys Leu Asp 


315 




320 



<211> 217 
<212> PRT 
<213> Homo sapiens 



<400> 89 

Ala Pro Pro Pro Ala Ala Ser Gln Gly Glu Arg Met Ala Gly Leu Ala

1 5 10 15

Ala Arg Leu Val Leu Leu Ala Gly Ala Ala Ala Leu Ala Ser Gly Ser 

20 25 30 

Gln Gly Asp Arg Glu Pro Val Tyr Arg Asp Cys Val Leu Gln Cys Glu

35 40 45 

Glu Gln Asn Cys Ser Gly Gly Ala Leu Asn His Phe Arg Ser Arg Gln

50 55 60

Pro Ile Tyr Met Ser Leu Ala Gly Trp Thr Cys Arg Asp Asp Cys Lys
65 70 75 80 

Tyr Glu Cys Met Trp Val Thr Val Gly Leu Tyr Leu Gln Glu Gly His

85 90 95 

Lys Val Pro Gln Phe His Gly Lys Trp Pro Phe Ser Arg Phe Leu Phe

100 105 110 

Phe Gln Glu Pro Ala Ser Ala Val Ala Ser Phe Leu Asn Gly Leu Ala

115 120 125 

Ser Leu Val Met Leu Cys Arg Tyr Arg Thr Phe Val Pro Ala Ser Ser 

130 135 140 

Pro Met Tyr His Thr Cys Val Ala Phe Ala Trp Val Ser Leu Asn Ala 
145 150 155 160 

Trp Phe Trp Ser Thr Val Phe His Thr Arg Asp Thr Asp Leu Thr Glu 
165 170 175 
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Lys Met Asp Tyr Phe Cys Ala Ser Thr Val lie Leu His Ser lie Tyr 

180 185 190 

Leu Cys Cys Val Ser Phe Leu Glu Asp Asp Ser Leu Tyr Leu Leu Lys

195 200 205 

Glu Ser Glu Asp Lys Phe Lys Leu Asp 
210 215

<210> 90

<211> 153

<212> PRT

<213> Homo sapiens



<400> 90 

Met Asn Val Gly Thr Ala His Ser Glu Val Asn Pro Asn Thr Arg Val

1 5 10 15 

Met Asn Ser Arg Gly Ile Trp Leu Ser Tyr Val Leu Ala Ile Gly Leu

20 25 30 

Leu His Ile Val Leu Leu Ser Ile Pro Phe Val Ser Val Pro Val Val

35 40 45 

Trp Thr Leu Thr Asn Leu Ile His Asn Met Gly Met Tyr Ile Phe Leu

50 55 60

His Thr Val Lys Gly Thr Pro Phe Glu Thr Pro Asp Gln Gly Lys Ala
65 70 75 80 

Arg Leu Leu Thr His Trp Glu Gln Met Asp Tyr Gly Val Gln Phe Thr

85 90 95 

Ala Ser Arg Lys Phe Leu Thr Ile Thr Pro Ile Val Leu Tyr Phe Leu

100 105 110

Thr Ser Phe Tyr Thr Lys Tyr Asp Gln Ile His Phe Val Leu Asn Thr

115 120 125 

Val Ser Leu Met Ser Val Leu Ile Pro Lys Leu Pro Gln Leu His Gly

130 135 140 

Val Arg Ile Phe Gly Ile Asn Lys Tyr
145 150 

<210> 91 

<211> 436 



<212> PRT 

<213> Homo sapiens 



<400> 91 

Met Arg Arg Asp Val Asn Gly Val Thr Lys Ser Arg Phe Glu Met Phe 

1 5 10 15

Ser Asn Ser Asp Glu Ala Val Ile Asn Lys Lys Leu Pro Lys Glu Leu

20 25 30 

Leu Leu Arg Ile Phe Ser Phe Leu Asp Val Val Thr Leu Cys Arg Cys

35 40 45 

Ala Gln Val Ser Arg Ala Trp Asn Val Leu Ala Leu Asp Gly Ser Asn
50 55 60 
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Glu 
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Cys Val Gin He 


305 








310 


Ser 


Ile


His 
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Pro Arg Leu Gin 
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He 
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Asp 


Asp Gly He Arg 
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355 
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Gly 


Ala 






60 






Leu 


Val 


Phe Ser Lys Leu 


Glu 


Asn 






75 




80 


Leu 


Ile


Glu Ala Ile Arg


Arg 


Ala 




90 




95 




Ile


Thr 


Asn Ser Arg Pro 


Pro 


Cys 


105 




110 
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<211> 144 
<212> PRT 
<213> Homo sapiens 



<400> 94 




Met


Gly 


Ala Val Val 


Leu Cys Arg 


1 


5 




Gln


Thr 


Gly Thr Gly 


Gln Gly Leu






20 




Cys 


Glu 


Ala Thr Pro 


Cys Gly Val 




35 


40 


Leu 


Leu 


Lys Gln His


Arg Gly Arg 




50 




55 


Val 


Ser 


Ala Cys Arg 


Glu Glu Ser 


65 






70 


Trp 


Ser 


Leu Leu Pro 


Ser Pro Val 




85 




Lys 


Arg 


Cys Gly Ser 


Leu Cys Pro 






100 




Arg 


Gly 


His Trp Ala 


Cys Phe Leu 






115 


120 


Pro 


Cys 


Ile Ile Gly


Asn Phe His 




130 




135 



<210> 95 
<211> 425 



<212> PRT 

<213> Homo sapiens 



Pro Ser Pro 


Leu 


Asn 


Phe Leu Ile


10 






15 


Ser Cys Gly 


Ser 


His 


Met Trp Arg 


25 






30 


Cys Gly Glu 


Ser 


Pro 


Val Gly Ser 






45 




Gly Lys Thr 


Trp 


Pro 


Val Gly Thr 




60 






Glu Ala Gly 


Ser 


Leu 


Ser Leu Gly 


75 






80 


Gly Leu Gly Ala 


Val 


Leu Ile Leu


90 






95 


Leu Pro Gly Val 


Gln


Gly Asn Arg 


105 






110 


Pro Pro Asp 


Pro 


Ala 


Ser Pro Thr 






125 




Leu Lys Ile


Phe 


Leu 


Ser Lys Val 




140 







<400> 95 




Met 


Gly 


Gly Gly 


Asp 


1 






5 


Leu 


Arg 


Asn Val 


Glu 






20 




Glu 


Arg 


Lys Lys 


Ile






35 




Ala 


Arg 


Glu Glu 


Met




50 






Lys 


Lys 


Glu Glu 


Lys 


65 








Val 


Asn 


Arg Asp 


Glu 








85 


Phe 


Glu 


Lys Met 


Glu 






100 




Leu 


Leu 


Pro Gly 


Ser 






115 




Asp 


Met 


Ala Ser 


Lys 




130 






Lys 


Glu 


Glu Glu 


Lys 



145 



Leu Asn Leu Lys Lys Ser 
10 

Lys Val Trp Lys Ala Glu 
25 

Glu Glu Leu Gln Arg Glu
40 

Gln Arg Tyr Ala Glu Asp
55 

Leu Asp Trp Met Tyr Gln
70 75 
Tyr Leu Leu Gly Arg Pro 
90 

Glu Lys Glu Ala Gly Cys 
105 

Ile Phe Ala Pro Ser Gly
120 

Ile Arg Glu Asp Pro Leu
135 

Lys Arg Glu Val Leu Asn 
150 155 



Trp 


His 


Pro Gln Thr






15 


Gln


Lys 


His Glu Ala 






30 


Leu 


Arg 


Glu Glu Arg 




45 




Val 


Gly 


Ala Val Lys 


60 






Gly 


Pro 


Gly Gly Met 






80 


Ile


Asp 


Lys Tyr Val 






95 


Ser 


Ser 


Glu Thr Gly 






110 


Ala 


Asn 


Ser Leu Leu 




125 




Phe 


Ile


Ile Arg Lys


140 






Asn 


Pro 


Val Lys Met 






160 
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Lys 


Lys 


Ile Lys


Glu Leu Leu Gln








165 


Lys 


Lys 


Lys Lys 


Glu Lys Lys Lys 






180 




Ser 


Ser 


Ser Ser 


Asp Arg Ser Ser 






195 


200 


Arg 


Ser 


Gln Lys


Lys Met Ala Asn 


210 




215 


Pro 


Gly 


Tyr Gly 


Leu Gln Val Arg


225 






230 


Gln


Gly 


Pro Leu 


Thr Ala Glu Gln








245 


His 


Ser 


Arg Ser 


Arg Ser Ser Ser 






260 




Lys 


Lys 


Ser Thr 


Arg Glu Ala Gly 






275 


280 


Leu 


Gly 


Arg Arg 


Ser Arg Ser Pro 




290 




295 


Lys 


Val 


Asn Arg 


Arg Glu Thr Gly 


305 






310 


Lys 


Glu 


Val Tyr 


Gln Arg Arg His








325 


Ser 


Ala 


Glu Glu 


Leu Glu Arg Lys 






340 




Lys 


Trp 


Arg Glu 


Glu Glu Arg Leu 






355 


360 


Asp 


Glu 


Glu Arg 


Glu Gln Arg Leu




370 




375 


Lys 


Phe 


Ile His


Arg Met Lys Leu


385






390 


Glu 


Asp 


Arg Val 


Lys Arg Asn Ile








405 


Ala 


Leu 


Glu Lys 


Asn Phe Met Lys 






420 





<210> 96 

<211> 394 

<212> PRT 

<213> Homo sapiens 



Met


Ser Leu Glu Lys


Lys Glu Lys 




170 


175 


Lys 


His Lys Lys His 


Lys His Arg 


185 




190 


Ser 


Glu Asp Glu His 


Ser Ala Gly 




205 




Ser 


Ser Pro Val Leu 


Ser Lys Val 




220 




Asn 


Ser Asp Arg Asn 


Gln Gly Leu




235 


240 


Lys 


Arg Gly His Gly 


Met Lys Asn 


250 


255 


His 


Ser Pro Pro Arg 


His Ala Ser 


265 




270 


Ser 


Arg Asp Arg Arg 


Ser Arg Ser 




285 




Arg 


Pro Ser Lys Leu 


His Asn Ser 




300 




Gln


Thr Arg Ser Pro 


Ser Pro Lys 




315 


320 


Ala 


Pro Gly Tyr Thr 


Arg Lys Leu 




330 


335 


Arg 


Gln Glu Met Met


Glu Asn Ala 


345 





350 


Asn 


Ile Leu Lys Arg


His Ala Lys 




365 




Glu 


Lys Leu Asp Ser 


Arg Asp Gly 




380 




Glu 


Ser Ala Ser Thr 


Ser Ser Leu 




395 


400 


Tyr 


Ser Leu Gln Arg


Thr Ser Val 


410 


415 


Arg 






425 







<400> 96 

Met Phe Ser Val Phe Glu Glu Ile Thr Arg Ile Val Val Lys Glu Met
1 5 10 15 

Asp Ala Gly Gly Asp Met Ile Ala Val Arg Ser Leu Val Asp Ala Asp
20 25 30 

Arg Phe Arg Cys Phe His Leu Val Gly Glu Lys Arg Thr Phe Phe Gly 

35 40 45 

Cys Arg His Tyr Thr Thr Gly Leu Thr Leu Met Asp Ile Leu Asp Thr

50 55 60 

His Gly Asp Lys Trp Leu Asp Glu Leu Asp Ser Gly Leu Gln Gly Gln
65 70 75 80 

Lys Ala Glu Phe Gln Ile Leu Asp Asn Val Asp Ser Thr Gly Glu Leu
85 90 95 
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Ho Val Arg Leu Pro Lys Glu He Thr He Sor Gly Ser Phe Gin Gly 

100 105 110

Phe His His Gln Lys Ile Lys Ile Ser Glu Asn Arg Ile Ser Gln Gln

115 120 125 

Tyr Leu Ala Thr Leu Glu Asn Arg Lys Leu Lys Arg Glu Leu Pro Phe 

130 135 140 

Ser Phe Arg Ser Ile Asn Thr Arg Glu Asn Leu Tyr Leu Val Thr Glu
145 150 155 160 

Thr Leu Glu Thr Val Lys Glu Glu Thr Leu Lys Ser Asp Arg Gln Tyr

165 170 175 

Lys Phe Trp Ser Gln Ile Ser Gln Gly His Leu Ser Tyr Lys His Lys

180 185 190 

Gly Gln Arg Glu Val Thr Ile Pro Pro Asn Arg Val Leu Ser Tyr Arg

195 200 205 

Val Lys Gln Leu Val Phe Pro Asn Lys Glu Thr Met Arg Lys Ser Leu

210 215 220 

Gly Ser Glu Asp Ser Arg Asn Met Lys Glu Lys Leu Glu Asp Met Glu 
225 230 235 240 

Ser Val Leu Lys Asp Leu Thr Glu Glu Lys Arg Lys Asp Val Leu Asn 

245 250 255 

Ser Leu Ala Lys Cys Leu Gly Lys Glu Asp Ile Arg Gln Asp Leu Glu

260 265 270 

Gln Arg Val Ser Glu Val Leu Ile Ser Gly Glu Leu His Met Glu Asp

275 280 285 

Pro Asp Lys Pro Leu Leu Ser Ser Leu Phe Asn Ala Ala Gly Val Leu 

290 295 300 

Val Glu Ala Arg Ala Lys Ala Ile Leu Asp Phe Leu Asp Ala Leu Leu
305 310 315 320 

Glu Leu Ser Glu Glu Gln Gln Phe Val Ala Glu Ala Leu Glu Lys Gly

325 330 335 

Thr Leu Pro Leu Leu Lys Asp Gln Val Lys Ser Val Met Glu Gln Asn

340 345 350 

Trp Asp Glu Leu Ala Ser Ser Pro Pro Asp Met Asp Tyr Asp Pro Glu 

355 360 365 

Ala Arg Ile Leu Cys Ala Leu Tyr Val Val Val Ser Ile Leu Leu Glu

370 375 380 

Leu Ala Glu Gly Pro Thr Ser Val Ser Ser 
385 390

<210> 97

<211> 456

<212> PRT

<213> Homo sapiens



<400> 97 

Met Glu Gly Pro Glu Gly Leu Gly Arg Lys Gln Ala Cys Leu Ala Met

1 5 10 15 

Leu Leu His Phe Leu Asp Thr Tyr Gln Gly Leu Leu Gln Glu Glu Glu

20 25 30 

Gly Ala Gly His Ile Ile Lys Asp Leu Tyr Leu Leu Ile Met Lys Asp

35 40 45

Glu Ser Leu Tyr Gln Gly Leu Arg Glu Asp Thr Leu Arg Leu His Gln
50 55 60 
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Leu Val Glu Thr Val Glu Lou Lys ilo Pro Glu Glu Asn Gin Pro Pro 
65 70 75 80 

Ser Lys Gln Val Lys Pro Leu Phe Arg His Phe Arg Arg Ile Asp Ser

85 90 95 

Cys Leu Gln Thr Arg Val Ala Phe Arg Gly Ser Asp Glu Ile Phe Cys

100 105 110 

Arg Val Tyr Met Pro Asp His Ser Tyr Val Thr Ile Arg Ser Arg Leu

115 120 125 

Ser Ala Ser Val Gln Asp Ile Leu Gly Ser Val Thr Glu Lys Leu Gln

130 135 140 

Tyr Ser Glu Glu Pro Ala Gly Arg Glu Asp Ser Leu Ile Leu Val Ala
145 150 155 160 

Val Ser Ser Ser Gly Glu Lys Val Leu Leu Gln Pro Thr Glu Asp Cys

165 170 175 

Val Phe Thr Ala Leu Gly Ile Asn Ser His Leu Phe Ala Cys Thr Arg

180 185 190 

Asp Ser Tyr Glu Ala Leu Val Pro Leu Pro Glu Glu Ile Gln Val Ser

195 200 205 

Pro Gly Asp Thr Glu Ile His Arg Val Glu Pro Glu Asp Val Ala Asn

210 215 220 

His Leu Thr Ala Phe His Trp Glu Leu Phe Arg Cys Val His Glu Leu 
225 230 235 240 

Glu Phe Val Asp Tyr Val Phe His Gly Glu Arg Gly Arg Arg Glu Thr 

245 250 255 

Ala Asn Leu Glu Leu Leu Leu Gln Arg Cys Ser Glu Val Thr His Trp

260 265 270 

Val Ala Thr Glu Val Leu Leu Cys Glu Ala Pro Gly Lys Arg Ala Gln

275 280 285 

Leu Leu Lys Lys Phe Ile Lys Ile Ala Ala Leu Cys Lys Gln Asn Gln

290 295 300 

Asp Leu Leu Ser Phe Tyr Ala Val Val Met Gly Leu Asp Asn Ala Ala 
305 310 315 320 

Val Ser Arg Leu Arg Leu Thr Trp Glu Lys Leu Pro Gly Lys Phe Lys 

325 330 335 

Asn Leu Phe Arg Lys Phe Glu Asn Leu Thr Asp Pro Cys Arg Asn His 

340 345 350 

Lys Ser Tyr Arg Glu Val Ile Ser Lys Met Lys Pro Pro Val Ile Pro 

355 360 365 

Phe Val Pro Leu Ile Leu Lys Asp Leu Thr Phe Leu His Glu Gly Ser 

370 375 380 

Lys Thr Leu Val Asp Gly Leu Val Asn Ile Glu Lys Leu His Ser Val 
385 390 395 400 

Ala Glu Lys Val Arg Thr Ile Arg Lys Tyr Arg Ser Arg Pro Leu Cys 

405 410 415 

Leu Asp Met Glu Ala Ser Pro Asn His Leu Gln Thr Lys Ala Tyr Val 

420 425 430 

Arg Gln Phe Gln Val Ile Asp Asn Gln Asn Leu Leu Phe Glu Leu Ser 

435 440 445 

Tyr Lys Leu Glu Ala Asn Ser Gln 
450 455 

<210> 98 

<211> 715 

<212> PRT 

<213> Homo sapiens 

<400> 98 

Met Ser Gln Val Met Ser Ser Pro Leu Leu Ala Gly Gly His Ala Val 

1 5 10 15 

Ser Leu Ala Pro Cys Asp Glu Pro Arg Arg Thr Leu His Pro Ala Pro 

20 25 30 

Ser Pro Ser Leu Pro Pro Gln Cys Ser Tyr Tyr Thr Thr Glu Gly Trp 

35 40 45 

Gly Ala Gln Ala Leu Met Ala Pro Val Pro Cys Met Gly Pro Pro Gly 

50 55 60 

Arg Leu Gln Gln Ala Pro Gln Val Glu Ala Lys Ala Thr Cys Phe Leu 
65 70 75 80 

Pro Ser Pro Gly Glu Lys Ala Leu Gly Thr Pro Glu Asp Leu Asp Ser 

85 90 95 

Tyr Ile Asp Phe Ser Leu Glu Ser Leu Asn Gln Met Ile Leu Glu Leu 

100 105 110 

Asp Pro Thr Phe Gln Leu Leu Pro Pro Gly Thr Gly Gly Ser Gln Ala 

115 120 125 

Glu Leu Ala Gln Ser Thr Met Ser Met Arg Lys Lys Glu Glu Ser Glu 

130 135 140 

Ala Leu Asp Ile Lys Tyr Ile Glu Val Thr Ser Ala Arg Ser Arg Cys 
145 150 155 160 

His Asp Trp Pro Gln His Cys Ser Ser Pro Ser Val Thr Pro Pro Phe 

165 170 175 

Gly Ser Pro Arg Ser Gly Gly Leu Leu Leu Ser Arg Asp Val Pro Arg 

180 185 190 

Glu Thr Arg Ser Ser Ser Glu Ser Leu Ile Phe Ser Gly Asn Gln Gly 

195 200 205 

Arg Gly His Gln Arg Pro Leu Pro Pro Ser Glu Gly Leu Ser Pro Arg 

210 215 220 

Pro Pro Asn Ser Pro Ser Ile Ser Ile Pro Cys Met Gly Ser Lys Ala 
225 230 235 240 

Ser Ser Pro His Gly Leu Gly Ser Pro Leu Val Ala Ser Pro Arg Leu 

245 250 255 

Glu Lys Arg Leu Gly Gly Leu Ala Pro Gln Arg Gly Ser Arg Ile Ser 

260 265 270 

Val Leu Ser Ala Ser Pro Val Ser Asp Val Ser Tyr Met Phe Gly Ser 

275 280 285 

Ser Gln Ser Leu Leu His Ser Ser Asn Ser Ser His Gln Ser Ser Ser 

290 295 300 

Arg Ser Leu Glu Ser Pro Ala Asn Ser Ser Ser Ser Leu His Ser Leu 
305 310 315 320 

Gly Ser Val Ser Leu Cys Thr Arg Pro Ser Asp Phe Gln Ala Pro Arg 

325 330 335 

Asn Pro Thr Leu Thr Met Gly Gln Pro Arg Thr Pro His Ser Pro Pro 

340 345 350 

Leu Ala Lys Glu His Ala Ser Ile Cys Pro Pro Ser Ile Thr Asn Ser 

355 360 365 

Met Val Asp Ile Pro Ile Val Leu Ile Asn Gly Cys Pro Glu Pro Gly 

370 375 380 

Ser Ser Pro Pro Gln Arg Thr Pro Gly His Gln Asn Ser Val Gln Pro 
385 390 395 400 

Gly Ala Ala Ser Pro Ser Asn Pro Cys Pro Ala Thr Arg Ser Asn Ser 

405 410 415 

Gln Thr Leu Ser Asp Ala Pro Phe Thr Thr Cys Pro Glu Gly Pro Ala 

420 425 430 

Arg Asp Met Gln Pro Thr Met Lys Phe Val Met Asp Thr Ser Lys Tyr 
435 440 445 

Trp Phe Lys Pro Asn Ile Thr Arg Glu Gln Ala Ile Glu Leu Leu Arg 

450 455 460 

Lys Glu Glu Pro Gly Ala Phe Val Ile Arg Asp Ser Ser Ser Tyr Arg 
465 470 475 480 

Gly Ser Phe Gly Leu Ala Leu Lys Val Gln Glu Val Pro Ala Ser Ala 

485 490 495 

Gln Asn Arg Pro Gly Glu Asp Ser Asn Asp Leu Ile Arg His Phe Leu 

500 505 510 

Ile Glu Ser Ser Ala Lys Gly Val His Leu Lys Gly Ala Asp Glu Glu 

515 520 525 

Pro Tyr Phe Gly Ser Leu Ser Ala Phe Val Cys Gln His Ser Ile Met 

530 535 540 

Ala Leu Ala Leu Pro Cys Lys Leu Thr Ile Pro Gln Arg Glu Leu Gly 
545 550 555 560 

Gly Ala Asp Gly Ala Ser Asp Ser Thr Asp Ser Pro Ala Ser Cys Gln 

565 570 575 

Lys Lys Ser Ala Gly Cys His Thr Leu Tyr Leu Ser Ser Val Ser Val 

580 585 590 

Glu Thr Leu Thr Gly Ala Leu Ala Val Gln Lys Ala Ile Ser Thr Thr 

595 600 605 

Phe Glu Arg Asp Ile Leu Pro Thr Pro Thr Val Val His Phe Glu Val 

610 615 620 

Thr Glu Gln Gly Ile Thr Leu Thr Asp Val Gln Arg Lys Val Phe Phe 
625 630 635 640 

Arg Arg His Tyr Pro Leu Thr Thr Leu Arg Phe Cys Gly Met Asp Pro 

645 650 655 

Glu Gln Arg Lys Trp Gln Lys Tyr Cys Lys Pro Ser Trp Ile Phe Gly 

660 665 670 

Phe Val Ala Lys Ser Gln Thr Glu Pro Gln Glu Asn Val Cys His Leu 

675 680 685 

Phe Ala Glu Tyr Asp Met Val Gln Pro Ala Ser Gln Val Ile Gly Leu 

690 695 700 

Val Thr Ala Leu Leu Gln Asp Ala Glu Arg Met 
705 710 715 

<210> 99 

<211> 35 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> PCR primer 

<400> 99 

ccatatataa aaccactgtc ctgtcctttg tggct 

<210> 100 

<211> 26 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 100 

cccccatctg tctgtctata tttgtc 



<210> 101 

<211> 22 

<212> DNA 

<213> Artificial sequence 



<220> 



<223> PCR primer 
<400> 101 

tgcctacgct gacgactatg tg 
<210> 102 
<211> 25 
<212> DNA 



<213> Artificial sequence 



<220> 

<223> PCR primer 
<400> 102 

tttggttttc tacaactgtt gctat 
<210> 103 



<211> 19 
<212> DNA 



<213> Artificial sequence 



<220> 

<223> PCR primer 
<400> 103 

gggctccaca caccagatg 
<210> 104 

<211> 21 

<212> DNA 

<213> Artificial sequence 

<220> 

<223> PCR primer 

<400> 104 

acgctctgag caccctctac a 

<210> 105 

<211> 31 

<212> DNA 

<213> Artificial sequence 

<220> 

<223> PCR primer 

<400> 105 

tgtcacaggg actgaaaacc tctcctcatg t 

<210> 106 

<211> 17 

<212> DNA 

<213> Artificial sequence 

<220> 

<223> PCR primer 

<400> 106 
cccaaggcca cgagctt 

<210> 107 

<211> 24 

<212> DNA 

<213> Artificial sequence 
<220> 



<223> PCR primer 

<400> 107 

tgttgctctc ttaacgaatc gaaa 

<210> 108 

<211> 29 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 108 

ctggtcaaac aaactctctg aacccctcc 

<210> 109 

<211> 20 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 109 

tggtgaggaa aagcggacat 

<210> 110 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 110 

ctggcttgga ggacagtgaa g 

<210> 111 



<211> 24 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 111 

ccaagccctc cccatcccat gtat 

<210> 112 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 112 

gaggtgtcgt accgcgttct a 

<210> 113 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 113 

ccgttctgct cttccctgtc t 

<210> 114 

<211> 23 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 114 

ccagacccgc ttcactgacc tgc 

<210> 115 

<211> 20 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 115 

cgcctgtact tcagcatgga 

<210> 116 

<211> 18 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 116 
gcggttcagc tggtggaa 

<210> 117 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 117 

accccgaggc atcaccacaa atcat 

<210> 118 

<211> 23 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 118 

agttctgcct ctctgacaac cat 
<210> 119 
<211> 23 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 119 

taggctcaga gtcagaccca aac 
<210> 120 
<211> 21 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 120 

ccctcgtggg cttgtgctcg g 
<210> 121 
<211> 21 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 121 

aagccgccag ttcatctttt t 
<210> 122 
<211> 25 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 122 

cttgtggttc aagtcaaatg ttcag 
<210> 123 
<211> 21 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 123 

tctgcctgcg ctctcgtcgg t 
<210> 124 
<211> 18 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 124 
gggctgggca cctgactt 

<210> 125 

<211> 20 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 125 

cccaacaagg gtcccagact 

<210> 126 

<211> 17 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 126 
cggcgcattg agcggcg 

<210> 127 

<211> 20 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 127 

cccaagggac ttcgtgaatg 

<210> 128 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 128 

ggcgatccct gatgacaagt a 

<210> 129 

<211> 29 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 129 

agcaccaact gtgaaccagg tacaatggc 

<210> 130 

<211> 19 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 130 

gagggaggct ctgctttgg 

<210> 131 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 131 

tcacaactag cgggtgagga g 

<210> 132 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 132 

tgcagaggaa cggcgtgagc g 

<210> 133 



<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 133 

tgaggtttcc tcccaaatcg ta 

<210> 134 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 134 

cagctcaagg gaagctgtca tc 

<210> 135 

<211> 24 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 135 

cccccacatg ttccccaaga tgct 

<210> 136 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 136 

ggaggcgcta aaggtctacg t 

<210> 137 

<211> 21 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> PCR primer 
<400> 137 

tgatgcttcg caggtcagta a 
<210> 138 
<211> 26 



<212> DNA 

<213> Artificial sequence 



<220> 



<223> PCR primer 
<400> 138 

ctcctgcccc tcctaaagct gaagcc 
<210> 139 
<211> 17 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 139 
ggacgcgtgg gcttttc 

<210> 140 



<211> 20 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 140 

tgtggctgtg gacacctttc 

<210> 141 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 141 

ccacaagctg aaggcagaca aggcc 

<210> 142 

<211> 20 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 142 

gcggattctc atggaacaca 

<210> 143 

<211> 20 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 143 

ggtcagccag gagcttcttg 

<210> 144 

<211> 23 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 144 

accaccttgc gcaggttgtc cag 

<210> 145 

<211> 18 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 145 
cgcatgcacg acctgaac 

<210> 146 

<211> 23 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 146 

gtctcgatct tggacagctt ctg 

<210> 147 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 147 

acactgtcca cacggcccga gg 

<210> 148 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 148 

ctgggcagaa tggaaggatc t 

<210> 149 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 149 

gggactctag cagacccaca ct 

<210> 150 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 150 

cacccacctg gattccctgt tc 

<210> 151 

<211> 23 

<212> DNA 

<213> Artificial sequence 

<220> 

<223> PCR primer 

<400> 151 

ccttcagaca ggcgtagatg atg 23 

<210> 152 

<211> 29 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> PCR primer 

<400> 152 

gggtattatt tctttattag gtgacactt 29 

<210> 153 

<211> 30 

<212> DNA 

<213> Artificial sequence 

<220> 

<223> PCR primer 

<400> 153 
ttccctaagg ctttcagtac ccaggatctg 

<210> 154 

<211> 16 

<212> DNA 

<213> Artificial sequence 

<220> 

<223> PCR primer 

<400> 154 
ccagcttggc cctttcct 

<210> 155 

<211> 23 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 155 

gaatgggtcg cttttgttct tag 

<210> 156 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 156 

tcacggacct cagcctgccc ct 

<210> 157 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 157 

tggtgaaggt gtcagccatg t 

<210> 158 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 158 

tcagagtgca gcaatggctt t 
<210> 159 
<211> 20 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 159 

acctccttcc ccagctcccc 
<210> 160 
<211> 24 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 160 

ggcaacatct tacttgtcct ttga 
<210> 161 
<211> 25 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 161 

ccaaggaagc acagacaact atttc 
<210> 162 
<211> 30 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 162 

tcctccctat ccatggcact aaaccacttc 30 



<210> 163 

<211> 19 

<212> DNA 

<213> Artificial sequence 


<220> 

<223> PCR primer 

<400> 163 

tgggcaaggg ctcctatct 19 

<210> 164 

<211> 21 

<212> DNA 

<213> Artificial sequence 


<220> 

<223> PCR primer 

<400> 164 

gttacccctg gcagacgtat g 21 

<210> 165 

<211> 31 


<212> DNA 

<213> Artificial sequence 



<220> 

<223> PCR primer 
<400> 165 

tgcctctgag tctgaatctc ccaaagagag a 31 

<210> 166 

<211> 31 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 166 

gagtagttat gtgattattt cagctcttga c 

<210> 167 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 167 

tcaaatgttg tccccgagtc t 

<210> 168 

<211> 34 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 168 

cagaaattcg gaagacagaa ctattgtcat gcct 

<210> 169 

<211> 27 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 169 

gattagtaac ccatagcagt tgaaggt 
<210> 170 
<211> 26 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 170 

atttactgac ggtggtctga acatac 
<210> 171 
<211> 31 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 171 

tgacagactc caaatcacaa gcacagtcaa c 
<210> 172 
<211> 25 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 172 

tgatggtttg gaggaaagtt tattt 
<210> 173 
<211> 24 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 173 

tttggttggg tctttagagg aatc 
<210> 174 
<211> 24 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 174 

tgccaaccat gcatcaggta gccc 
<210> 175 
<211> 20 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 175 

cagctcacct ggcaacttca 
<210> 176 
<211> 20 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 
<400> 176 

cctgattttc ccagcgatgt 

<210> 177 

<211> 19 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 177 

cgccgctccc ggttctgct 

<210> 178 

<211> 20 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 178 

tggccaagcg taagctgatt 

<210> 179 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> PCR primer 

<400> 179 

gctgcagtga tcggatcatc t 

<210> 180 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> MLLT6 

<400> 180 

caccatggag cccatcgtgc tg 

<210> 181 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> MLLT6 for 

<400> 181 

atccccgagg tgcaatttg 

<210> 182 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> MLLT6 rev 

<400> 182 

agcgatcatg aggcacgtac t 

<210> 183 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> ZNF144 

<400> 183 

cctgccagag ataggagacc cagacagct 

<210> 184 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> ZNF144 for 
<400> 184 

atccccctga gccttttca 



<210> 185 

<211> 19 

<212> DNA 

<213> Artificial Sequence 

<220> 

<223> ZNF144 rev 

<400> 185 

cagcctctgg tcccaccat 19 

<210> 186 

<211> 28 

<212> DNA 

<213> Artificial Sequence 


<220> 

<223> PIP5K2B 

<400> 186 

tgatcatcaa ttccaaacct ctcccgaa 28 

<210> 187 

<211> 19 


<212> DNA 

<213> Artificial Sequence 



<220> 



<223> PIP5K2B for 
<400> 187 
ccccatggtg ttccgaaac 

<210> 188 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PIP5K2B rev 

<400> 188 

tgccaggagc ctccatacc 19 

<210> 189 

<211> 29 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> TEM7 
<400> 189 

cagccttcta aaacacaatg tattcatgt 29 


<210> 190 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> TEM7 for 

<400> 190 

cctgaactta atggtagaat tcaaagatc 29 

<210> 191 

<211> 27 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> TEM7 rev 
<400> 191 

tattaacact gagaatccat gcagaga 
<210> 192 
<211> 35 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> ZNFN1A3 
<400> 192 

tatctggtct cagggattgc tcctatgtat tcagc 
<210> 193 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> ZNFN1A3 for 
<400> 193 

capagagccc tgctgaagtg 
<210> 194 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> ZNFN1A3 rev 
<400> 194 

gcgaggtcat tggtttttag aaa 
<210> 195 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> WIRE 
<400> 195 

ctgtgatccg aaatggtgcc ag 
<210> 196 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> WIRE for 
<400> 196 

ccgtctccac atccaaacct 
<210> 197 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> WIRE rev 
<400> 197 

acccatgcat tcggtatggt 
<210> 198 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PSMB3 
<400> 198 

agtggcacct gcgccgaaca a 

<210> 199 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PSMB3 for 

<400> 199 

ccccatggtg actgatgact t 

<210> 200 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PSMB3 rev 

<400> 200 

ccagagggac tcacacattc c 

<210> 201 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> MGC9753 

<400> 201 

ccagaaactt tccatcccaa aggcagtct 

<210> 202 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> MGC9753 for 
<400> 202 

ctgccccaca ggaatagaat g 
<210> 203 
<211> 23 
<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> MGC9753 rev 
<400> 203 

aaaaatccag tctgcttcaa cca 
<210> 204 
<211> 20 
<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> ORMDL3 
<400> 204 

agctgcccca gctccacgga 
<210> 205 
<211> 21 
<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> ORMDL3 for 
<400> 205 

tccctgatga gcgtgcttat c 
<210> 206 
<211> 28 
<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<220> 

<223> ORMDL3 rev 

<400> 206 

tctcagtact tattgattcc aaaaatcc 

<210> 207 

<211> 25 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<220> 

<223> MGC15482 

<400> 207 

tccagtggaa gcaaccccag tgttc 

<210> 208 

<211> 25 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> MGC15482 for 

<400> 208 

cacttctaga gctaccgtgg agtct 

<210> 209 

<211> 22 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<220> 

<223> MGC15482 rev 
<400> 209 

ccctcacttt gtaacccttg ct 

<210> 210 

<211> 20 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> PPP1R1B 

<400> 210 

cagcgtggcg caacaaccca 

<210> 211 

<211> 21 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> PPP1R1B for 

<400> 211 

gggattgttt cgccacacat a 

<210> 212 

<211> 20 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> PPP1R1B rev 

<400> 212 

ccgatgttaa ggcccatagc 

<210> 213 

<211> 27 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> MGC14832 

<400> 213 

caaaatgtcc ggccaacatg agttccc 

<210> 214 

<211> 17 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> MGC14832 for 

<400> 214 
cgcagtgcct ggcacat 

<210> 215 

<211> 20 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> MGC14832 rev 

<400> 215 

gacaccccct gacctatgga 

<210> 216 

<211> 25 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> LOC51242 

<400> 216 

cagtgacctc tcccgttccc ttgga 

<210> 217 

<211> 20 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> LOC51242 for 

<400> 217 

tgggtccctg tgtcctcttc 

<210> 218 

<211> 20 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> LOC51242 for 

<400> 218 

agggtcagga gggagaaaac 

<210> 219 

<211> 26 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> FLJ20291 

<400> 219 

ccagtgccca cccgttaaag agtcaa 

<210> 220 

<211> 24 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> FLJ20291 for 

<400> 220 

ttgtgggaca ctcagtaact ttgg 

<210> 221 

<211> 20 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> FLJ20291 rev 

<400> 221 

acaagcactc ccaccgagat 

<210> 222 

<211> 24 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> PRO2521 

<400> 222 

agtctgtcct cactgccatc gcca 

<210> 223 

<211> 21 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> PRO2521 for 

<400> 223 

aagcctctgg gttttccctt t 

<210> 224 

<211> 20 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> PRO2521 rev 

<400> 224 

cccactggtg acaggatggt 

<210> 225 

<211> 23 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> LINK-GEFII 

<400> 225 

catctgacat ctttcccgtg gag 

<210> 226 

<211> 21 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> LINK-GEFII for 

<400> 226 

ctttgcacga tgtctcaacc a 

<210> 227 

<211> 18 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> LINK-GEFII rev 

<400> 227 
tttcccgtgg agcaggaa 

<210> 228 

<211> 26 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<220> 

<223> CTEN 

<400> 228 

ccgccgccta atatgcaaca ttaggg 

<210> 229 

<211> 23 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 
<220> 

<223> CTEN for 

<400> 229 

cgagtattcc aaagctggta tcg 

<210> 230 

<211> 24 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<220> 

<223> CTEN rev 

<400> 230 

atcacagaga gatggccctt atct 

<210> 231 

<211> 25 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> D17S946 forward primer 
<400> 231 

acagtctatc aagcagaaaa atcct 

<210> 232 

<211> 16 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S946 reverse primer 

<400> 232 
tgccgtgcca gagaga 

<210> 233 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1181 forward primer 

<400> 233 

gacaacagag cgagactccc 

<210> 234 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1181 reverse primer 

<400> 234 

gcccagcctg tcacttattc 

<210> 235 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2026 forward primer 
<400> 235 
tggtcattcg acaacgaa 

<210> 236 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2026 reverse primer 

<400> 236 
cagcattgga tgcaatcc 

<210> 237 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S838 forward primer 

<400> 237 

ctccagaatc cagaccatga 

<210> 238 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S838 reverse primer 

<400> 238 

aggacagtgt gtagcccttc 

<210> 239 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S250 forward primer 

<400> 239 

ggaagaatca aatagacaat 

<210> 240 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S250 reverse primer 

<400> 240 

gctggccata tatatattta aacc 

<210> 241 

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1818 forward primer 

<400> 241 

cataggtatg ttcagaaatg tga 

<210> 242 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1818 reverse primer 

<400> 242 
tgcctactgg aaaccaga 

<210> 243 

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S614 forward primer 

<400> 243 

aaggggaagg ggctttcaaa gct 

<210> 244 

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S614 reverse primer 
<220> 

<221> misc_feature 

<222> (1)..(1) 

<223> n=a, c, g or t 

<400> 244 

nggaggttgc agtgagccaa gat 
<210> 245 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2019 forward primer 
<400> 245 

caaaagctta tgatgctcaa acc 

<210> 246 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2019 reverse primer 

<400> 246 

ttgtttccct ttgactttct ga 

<210> 247 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S608 forward primer 

<400> 247 

taggttcacc tctcattttc ttcag 

<210> 248 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S608 reverse primer 
<220> 

<221> misc_feature 

<222> (17)..(17) 

<223> n=a, c, g or t 

<400> 248 

CTtetciCTCTtrst ttatoonoct tnto 

<210> 249 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1655 forward primer 

<400> 249 

cggaccagag tgttccatgg 

<210> 250 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1655 reverse primer 

<400> 250 

gcatacagca ccctctacct 

<210> 251 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2147 forward primer 

<400> 251 

aggggagaat aaataaaatc tgtgg 

<210> 252 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2147 reverse primer 

<400> 252 

caggagtgag acactctcca tg 

<210> 253 

<211> 22 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> D17S754 forward primer 

<400> 253 

tggattcact gactcagcct gc 

<210> 254 

<211> 22 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> D17S754 reverse primer 

<400> 254 

gcgtgtctgt ctccatgtgt gc 

<210> 255 

<211> 18 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> D17S1614 forward primer 
<400> 255 
tccccaatga cggtgatg 

<210> 256 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1814 reverse primer 

<400> 256 

ctggaggttg gcttgtggat 

<210> 257 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2007 forward primer 

<400> 257 
ggtcccacga atttgctg 

<210> 258 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2007 reverse primer 

<400> 258 

ccacccagaa aaacaggaga 

<210> 259 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1246 forward primer 

<400> 259 

tcgatctcct gaccttgtga 

<210> 260 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1246 reverse primer 
<400> 260 

ttgtcacccc attgcctttc 
<210> 261 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1979 forward primer 
<400> 261 

ccttggatag attcagctcc c 
<210> 262 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1979 reverse primer 
<400> 262 

cttgtccctt ctcaatcctc c 
<210> 263 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1984 forward primer 

<400> 263 

ttaagcaagg ttttaattaa gctgc 

<210> 264 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1984 reverse primer 

<400> 264 

gattacagtg ctccctctcc c 

<210> 265 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> G11580 forward primer 

<400> 265 

ggttttaatt aagctgcatg gc 

<210> 266 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> G11580 reverse primer 

<400> 266 

gattacagtg ctccctctcc c 

<210> 267 

<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1867 forward primer 
<400> 267 

agtttgacac tgaggctttg 
<210> 268 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1867 reverse primer 
<400> 268 

tttagacttg gtaactgccg 
<210> 269 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1788 forward primer 
<400> 269 

tgcagatgcc taagaacttt tcag 
<210> 270 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1788 reverse primer 
<400> 270 

gccatgatct cccaaagcc 
<210> 271 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1836 forward primer 

<400> 271 
tcgaggttat ggtgagcc 

<210> 272 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1836 reverse primer 

<400> 272 

aaactgtgtg tgtcaaagga tact 

<210> 273 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1787 forward primer 

<400> 273 

gctgatctga agccaatga 

<210> 274 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1787 reverse primer 
<400> 274 

tacatgaagg catggtctg 

<210> 275 

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1660 forward primer 

<400> 275 

ctaatataat cctgggcaca tgg 

<210> 276 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1660 reverse primer 

<400> 276 
gctgcggacc agacagat 

<210> 277 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2154 forward primer 

<400> 277 

gataaaaaca agcactggct cc 

<210> 278 



<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2154 reverse primer 

<400> 278 

cccacggctt tcttgatcta 

<210> 279 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1955 forward primer 

<400> 279 

tgtaatgtaa gccccatgag g 

<210> 280 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1955 reverse primer 

<400> 280 

cactcaactc aacagtctaa aggtg 

<210> 281 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2098 forward primer 

<400> 281 

gtgagttcaa gcatagtaat tatcc 

<210> 282 

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2098 reverse primer 

<400> 282 

attcagcctc agttcactgc ttc 

<210> 283 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S518 forward primer 

<400> 283 

gatccagtgg agactcagag 

<210> 284 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S518 reverse primer 

<400> 284 

tagtctctgg gacacccaga 

<210> 285 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S518 forward primer 

<400> 285 

attcctgagt gtctaccctg ttgag 

<210> 286 

<211> 17 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S518 reverse primer 

<400> 286 
actgactgcg ccactgc

<210> 287 

<211> 20

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D11S4358 forward primer

<400> 287 

tcgagaagga caaaatcacc 

<210> 288 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D11S4358 reverse primer 

<400> 288 

gaacagggtt agtccattcg 

<210> 289 

<211> 19 






<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S964 forward primer 

<400> 289

gttctttcct cttgtgggg 

<210> 290 

<211> 19

<212> DNA

<213> Artificial Sequence 
<220> 

<223> D17S964 reverse primer 

<400> 290 

agtcagctga gattgtgcc 

<210> 291 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D19S1091 forward primer 

<400> 291 

caagccaaga catcccagtt 

<210> 292 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D19S1091 reverse primer 

<400> 292 

ccccacacac agctcatatg 
<210> 293

<211> 22

<212> DNA

<213> Artificial Sequence 
<220> 

<223> D17S1179 forward primer 

<400> 293 

ttttctctct cattccattg gg 

<210> 294 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S1179 reverse primer 

<400> 294 

gcaacagagg gagactccaa 

<210> 295 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D10S2160 forward primer 

<400> 295 

tcccatcccg taagacctc 

<210> 296 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> D10S2160 reverse primer 

<400> 296 

tatggagtac ctactctatg ccagg 25 

<210> 297 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> D17S1230 forward primer 

<400> 297 

attcaaagct ggatcccttt 20 

<210> 298 

<211> 20 

<212> DNA 

<213> Artificial Sequence 




<220> 

<223> D17S1230 reverse primer 

<400> 298 

agctgtgaca aatgcctgta 20

<210> 299 

<211> 20

<212> DNA 

<213> Artificial Sequence 


<220> 

<223> D17S1338 forward primer 

<400> 299

tcacctgaga ttgggagacc 20 

<210> 300 

<211> 18 

<212> DNA

<213> Artificial Sequence 
<220> 

<223> D17S1338 reverse primer 
<400> 300 
aagatggggc aggaatgg 

<210> 301 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2011 forward primer 

<400> 301 

tcactgtcct ccaagccag 

<210> 302 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2011 reverse primer 

<400> 302 

aaacaccaca ctctcccctg 

<210> 303 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2011 forward primer 

<400> 303 

ttcttgggct tcccgtagcc 
<210> 304

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2011 reverse primer 

<400> 304 

ggggcagacg acttctcctt 

<210> 305 

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> D17S2038 forward primer 

<400> 305 

ggggatacaa cctttaaagt tcc

<210> 306 

<211> 25 

<212> DNA

<213> Artificial Sequence 
<220> 

<223> D17S2038 reverse primer 

<400> 306 

attcacctaa tgaggattct tcttt 

<210> 307 

<211> 24 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> D17S2091 forward primer 

<400> 307 

gctgaaatag ccatcttgag ctac 24 

<210> 308 

<211> 23 

<212> DNA

<213> Artificial Sequence 



<220> 

<223> D17S2091 reverse primer 

<400> 308 

tccgcatcct ttttaagagg cac 23 

<210> 309 

<211> 24 

<212> DNA

<213> Artificial Sequence 



<220> 

<223> D17S649 forward primer 

<400> 309 

ctttcactct ttcagctgaa gagg 24

<210> 310 

<211> 25 

<212> DNA

<213> Artificial Sequence 



<220> 

<223> D17S649 reverse primer 
<400> 310 

tgacgtgcta tttcctgttt tgtct 25 
<210> 311 
<211> 18 
<212> DNA

<213> Artificial Sequence
<220>
<220>

<223> D17S1190 reverse primer 
<400> 312 
caacacacta ccccaaaa 
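
The declared lengths in the listing above can be cross-checked mechanically against the residues actually printed. The following is a minimal sketch and not part of the application: it assumes records laid out as above (a <210> identifier, a <211> length and a <400> sequence block), and the function name check_lengths and the record numbered 999 are hypothetical, introduced only for illustration.

```python
import re

def check_lengths(listing: str):
    """Return (seq_id, declared, actual) for records whose residue count
    differs from the length declared in the <211> field."""
    mismatches = []
    # A record runs from one <210> tag up to the next one (or end of text).
    for record in re.findall(r"<210>.*?(?=<210>|\Z)", listing, re.S):
        seq_id = re.search(r"<210>\s*(\d+)", record)
        declared = re.search(r"<211>\s*(\d+)", record)
        body = re.search(r"<400>\s*\d+\s*(.*)", record, re.S)
        if not (seq_id and declared and body):
            continue  # incomplete record, e.g. damaged during extraction
        # Keep letters only, so spaces and trailing position counts are ignored.
        residues = re.sub(r"[^a-z]", "", body.group(1).lower())
        if len(residues) != int(declared.group(1)):
            mismatches.append(
                (int(seq_id.group(1)), int(declared.group(1)), len(residues))
            )
    return mismatches

# SEQ ID NO: 276 is copied from the listing; record 999 is a made-up
# example whose declared length (20) does not match its 19 residues.
sample = (
    "<210> 276\n<211> 18\n<212> DNA\n<400> 276\ngctgcggacc agacagat\n"
    "<210> 999\n<211> 20\n<212> DNA\n<400> 999\ntacatgaagg catggtctg\n"
)
print(check_lengths(sample))
```

A check of this kind only flags a disagreement between the two fields; it cannot say which of them is wrong.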



Claims 

1. A method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection of at least 2 markers
characterized in that the markers are genes and fragments thereof or genomic nucleic acid sequences that are 
located on one chromosomal region which is altered in malignant neoplasia. 

2. A method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection of at least 2 markers 
characterized in that the markers are: 

a) genes that are located on one or more chromosomal region(s) which is/are altered in malignant neoplasia; 



b) 

i) receptor and ligand; or 

ii) members of the same signal transduction pathway; or 

iii) members of synergistic signal transduction pathways; or 

iv) members of antagonistic signal transduction pathways; or 

v) transcription factor and transcription factor binding site. 

3. The method of claim 1 or 2 wherein the malignant neoplasia is breast cancer, ovarian cancer, gastric cancer, colon 
cancer, esophageal cancer, mesenchymal cancer, bladder cancer or non-small cell lung cancer. 

4. The method of claim 1 or 2 wherein at least one chromosomal region is defined as the cytogenetic region: 1p13, 
1q32, 3p21-p24, 5p13-p14, 8q23-q24, 11q13, 12q13, 17q12-q24 or 20q13.






5. The method of claim 1 or 2 wherein at least one chromosomal region is defined as the cytogenetic region 17q11.2-21.3
and the malignant neoplasia is breast cancer, ovarian cancer, gastric cancer, colon cancer, esophageal cancer, 
mesenchymal cancer, bladder cancer or non-small cell lung cancer. 

6. The method of claim 1 or 2 wherein at least one chromosomal region is defined as the cytogenetic region 3p21-24
and the malignant neoplasia is breast cancer, ovarian cancer, gastric cancer, colon cancer, esophageal cancer,
mesenchymal cancer, bladder cancer or non-small cell lung cancer. 

7. The method of claim 1 or 2 wherein at least one chromosomal region is defined as the cytogenetic region 12q13 
and the malignant neoplasia is breast cancer, ovarian cancer, gastric cancer, colon cancer, esophageal cancer, 
mesenchymal cancer, bladder cancer or non-small cell lung cancer. 

8. A method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection of at least one marker
whereby the marker is a VNTR, SNP, RFLP or STS characterized in that the marker is located on one chromosomal
region which is altered in malignant neoplasia due to amplification and the marker is detected in a cancerous
and a non-cancerous tissue or biological sample of the same individual.

9. The method of claim 8 wherein the marker is selected from the group consisting of the VNTRs: 

D17S946, D17S1181, D17S2026, D17S838, D17S250, D17S1818, D17S614, D17S2019, D17S608, 
D17S1655, D17S2147, D17S754, D17S1814, D17S2007, D17S1246, D17S1979, D17S1984, D17S1984, 
D17S1867, D17S1788, D17S1836, D17S1787, D17S1660, D17S2154, D17S1955, D17S2098, D17S518, 
D17S1851, D11S4358, D17S964, D19S1091, D17S1179, D10S2160, D17S1230, D17S1338, D17S2011, 
D17S1237, D17S2038, D17S2091, D17S649, D17S1190 and M87506.

10. The method of claim 8 wherein the marker is selected from the group consisting of the SNPs: 

rs2230698, rs2230700, rs1058808, rs1801200, rs903506, rs2313170, rs1136201, rs2934968, rs2172826,
rs1810132, rs1801201, rs2230702, rs2230701, rs1126503, rs3471, rs13695, rs471692, rs558068, rs1064288,
rs1061692, rs520630, rs782774, rs565121, rs2586112, rs532299, rs2732786, rs1804539, rs1804538,
rs1804537, rs1141364, rs12231, rs1132259, rs1132257, rs113225, rs113225, rs1132254, rs113225, rs1132268
and rs11322

11. A method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection of at least one marker
characterized in that the marker is selected from: 

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 2 to 6, 
8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75; 

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) and encodes a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3 

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified
in (a) and (c) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (d) 

e) a purified polypeptide encoded by a polynucleotide or polynucleotide analog sequence specified in (a) to (e) 

f) a purified polypeptide comprising at least one of the sequences of SEQ ID NO: 28 to 32, 34, 35, 37 to 42,
44, 45, 47 to 52 or 76 to 98; 

are detected. 

12. A method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection of at least 2 markers 
characterized in that at least 2 markers are selected from:

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 1 to 26 
or 53 to 75; 

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) and encodes a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3 

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (c) 

e) a purified polypeptide encoded by a polynucleotide sequence or polynucleotide analog specified in (a) to (d) 

f) a purified polypeptide comprising at least one of the sequences of SEQ ID NO: 27 to 52 or 76 to 98 
are detected. 

13. The method of any of the claims 1 or 12 wherein the detection method comprises the use of PCR, arrays or beads.

14. A diagnostic kit comprising instructions for conducting the method of any of claims 1 to 13. 

15. A composition for the prediction, diagnosis or prognosis of malignant neoplasia comprising: 

a) a detection agent for: 

i) any polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 
2 to 6, 8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75;

ii) any polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the
respective sequence in Table 2 or 3

iii) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide
specified in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the
same biological function as specified for the respective sequence in Table 2 or 3

iv) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic 
variation of a polynucleotide sequence specified in (a) to (c) 

v) a polypeptide encoded by a polynucleotide or polynucleotide analog sequence specified in (a) to (d); 

vi) a polypeptide comprising at least one of the sequences of SEQ ID NO: 28 to 32, 34, 35, 37 to 42, 44, 
45, 47 to 52 or 76 to 98. 



b) at least 2 detection agents for at least 2 markers selected from: 

i) any polynucleotide comprising at least one of the sequences of SEQ ID NO: 1 to 26 or 53 to 75; 

ii) any polynucleotide which hybridizes under stringent conditions to a polynucleotide specified in (a) encoding
a polypeptide exhibiting the same biological function as specified for the respective sequence in
Table 2 or 3

iii) a polynucleotide the sequence of which deviates from the polynucleotide specified in (a) and (b) due
to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological function as
specified for the respective sequence in Table 2 or 3

iv) a polynucleotide which represents a specific fragment, derivative or allelic variation of a polynucleotide 
sequence specified in (a) to (c) 

v) a polypeptide encoded by a polynucleotide sequence specified in (a) to (d); 

vi) a polypeptide comprising at least one of the sequences of SEQ ID NO: 27 to 52 or 76 to 98. 

16. An array comprising a plurality of polynucleotides or polynucleotide analogs wherein each of the polynucleotides 
is selected from: 

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 1 to 26 
or 53 to 75; 

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3 

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (c) 

attached to a solid support. 

17. A method of screening for agents which regulate the activity of a polypeptide encoded by a polynucleotide or 
polynucleotide analog selected from the group consisting of:

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 2 to 6, 
8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75; 

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3 

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (c); 

comprising the steps of: 

i) contacting a test compound with at least one polypeptide encoded by a polynucleotide specified in (a) to 
(d); and

ii) detecting binding of the test compound to the polypeptide, wherein a test compound which binds to the 
polypeptide is identified as a potential therapeutic agent for modulating the activity of the polypeptide in order 
to prevent or treat malignant neoplasia.

18. A method of screening for agents which regulate the activity of a polypeptide encoded by a polynucleotide or 
polynucleotide analog selected from the group consisting of: 
a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 2 to 6,
8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75;

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3 

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (c) 

comprising the steps of: 

i) contacting a test compound with at least one polypeptide encoded by a polynucleotide specified in (a) to 
(d); and 

ii) detecting the activity of the polypeptide as specified for the respective sequence in Table 2 or 3, wherein a 
test compound which increases the activity is identified as a potential preventive or therapeutic agent for 
increasing the polypeptide activity in malignant neoplasia, and wherein a test compound which decreases the
activity of the polypeptide is identified as a potential therapeutic agent for decreasing the polypeptide activity 
in malignant neoplasia. 

19. A method of screening for agents which regulate the activity of a polynucleotide or polynucleotide analog selected 
from the group consisting of:

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 2 to 6, 
8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75; 

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3 

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (c) 

comprising the steps of: 

i) contacting a test compound with at least one polynucleotide or polynucleotide analog specified in (a) to (d), 
and 

ii) detecting binding of the test compound to the polynucleotide, wherein a test compound which binds to the
polynucleotide is identified as a potential preventive or therapeutic agent for regulating the activity of the
polynucleotide in malignant neoplasia.

20. Use of 

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 2 to 6, 
8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75; 

b) a polynucleotide which hybridizes under stringent conditions to a polynucleotide or polynucleotide analog 
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3;

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3;

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (c); 

e) an antisense molecule targeting specifically one of the polynucleotide sequences specified in (a) to (d); 

f) a purified polypeptide encoded by a polynucleotide or polynucleotide analog sequence specified in (a) to (d) 

g) a purified polypeptide comprising at least one of the sequences of SEQ ID NO: 28 to 32, 34, 35, 37 to 42,
44, 45, 47 to 52 or 76 to 98;

h) an antibody capable of binding to one of the polynucleotides specified in (a) to (d) or a polypeptide specified
in (f) and (g); 

i) a reagent identified by any of the methods of claims 17 to 19 that modulates the amount or activity of a
polynucleotide sequence specified in (a) to (d) or a polypeptide specified in (f) and (g); 

in the preparation of a composition for the prevention, prediction, diagnosis, prognosis or a medicament for the 
treatment of malignant neoplasia. 

21. Use of claim 20 wherein the disease is breast cancer.

22. A reagent that regulates the activity of a polypeptide selected from the group consisting of: 

a) a polypeptide encoded by any polynucleotide or polynucleotide analog comprising at least one of the
sequences of SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75;

b) a polypeptide encoded by any polynucleotide or polynucleotide analog which hybridizes under stringent 
conditions to any polynucleotide comprising at least one of the sequences of SEQ ID NO: 2 to 6, 8, 9, 11 to 
16, 18, 19, 21 to 26 or 53 to 75 encoding a polypeptide exhibiting the same biological function as specified for 
the respective sequence in Table 2 or 3 

c) a polypeptide encoded by any polynucleotide or polynucleotide analog the sequence of which deviates from
the polynucleotide specified in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide
exhibiting the same biological function as specified for the respective sequence in Table 2 or 3

d) a polypeptide encoded by any polynucleotide or polynucleotide analog which represents a specific fragment,
derivative or allelic variation of a polynucleotide sequence specified in (a) to (c) encoding a polypeptide
exhibiting the same biological function as specified for the respective sequence in Table 2 or 3

e) or a polypeptide comprising at least one of the sequences of SEQ ID NO: 28 to 32, 34, 35, 37 to 42, 44,
45, 47 to 52 or 76 to 98;

wherein said reagent is identified by the method of any of the claims 17 to 19.

23. A reagent that regulates the activity of a polynucleotide or polynucleotide analog selected from the group consisting 
of: 

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences SEQ ID NO: 2 to 6, 8, 
9, 11 to 16, 18, 19, 21 to 26 or 53 to 75; 

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation
of a polynucleotide sequence specified in (a) to (c) encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3

wherein said reagent is identified by the method of any of the claims 17 to 19.

24. A pharmaceutical composition, comprising:

a) an expression vector containing at least one polynucleotide or polynucleotide analog selected from the
group consisting of:

i) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 2
to 6, 8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75;

ii) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the
respective sequence in Table 2 or 3

iii) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide
specified in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the
same biological function as specified for the respective sequence in Table 2 or 3

iv) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic
variation of a polynucleotide sequence specified in (a) to (c) encoding a polypeptide exhibiting the same
biological function as specified for the respective sequence in Table 2 or 3;

or the reagent of claim 22 or 23 and a pharmaceutically acceptable carrier.

25. A computer-readable medium comprising:

a) at least one digitally encoded value representing a level of expression of at least one polynucleotide
sequence of SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75

b) at least 2 digitally encoded values representing the levels of expression of at least 2 polynucleotide
sequences selected from SEQ ID NO: 1 to 26 or 53 to 75

in a cell from a subject at risk for or having malignant neoplasia.

26. A method for the detection of chromosomal alterations characterized in that the relative abundance of individual
mRNAs, encoded by genes, located in altered chromosomal regions is detected. 

27. A method for the detection of chromosomal alterations characterized in that the copy number of one or more 
chromosomal region(s) is detected by quantitative PCR. 
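
Claim 27 does not state how a quantitative PCR readout is converted into a copy number. One common convention, shown here only as an illustrative sketch and not as the method of the application, is the comparative Ct calculation: the target region is normalised to a reference locus in both the test sample and a normal sample, and the difference is converted to a fold change. The sketch assumes near-perfect amplification efficiency (a doubling per cycle) and a diploid normal sample; the function name, parameter names and Ct values are hypothetical.

```python
def relative_copy_number(ct_target_sample, ct_ref_sample,
                         ct_target_normal, ct_ref_normal,
                         efficiency=2.0, normal_copies=2.0):
    """Estimate the copy number of a target region by the comparative Ct method.

    efficiency    -- assumed fold amplification per cycle (2.0 = perfect doubling)
    normal_copies -- assumed copy number of the region in the normal sample
    """
    # Normalise the target to the reference locus within each sample.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_normal = ct_target_normal - ct_ref_normal
    # A lower Ct means more starting template, hence the negative exponent.
    delta_delta = delta_sample - delta_normal
    return normal_copies * efficiency ** (-delta_delta)

# Hypothetical Ct values: the target crosses the threshold three cycles
# earlier in the test sample than in the normal sample.
amplified = relative_copy_number(22.0, 25.0, 25.0, 25.0)
unchanged = relative_copy_number(25.0, 25.0, 25.0, 25.0)
print(amplified, unchanged)
```

In practice the efficiency of each primer pair would have to be measured rather than assumed, since a small deviation from perfect doubling compounds over the cycle difference.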
